Advanced Calculus 


David Fearnley, D. Phil Oxford 


fearnlda@uvu.edu 


Contents 


1 The Field of Real Numbers 
2 Induction 

3 Sequences 

4 Limits and Continuity 

5 Differentiation 

6 Integration 


7 Supplementary Materials for One Variable 
7.1 The Natural Numbers
7.2 Fundamental Theorem of Arithmetic
7.3 Decimals
7.4 Cardinality
7.5 More on Infinite and Sequence Limits
7.6 Topology of the Real Line
7.7 L'Hospital's Rule
7.8 More on Integration
7.9 Exercises for Supplementary Materials for One Variable


8 Series 

9 Sequences of Functions 

10 Structure of Euclidean Space 

11 Differentiation in Higher Dimensions 
12 Integration in Higher Dimensions 

13 Vector Fields, Curves and Surfaces 


14 Multivariable Supplemental Materials 
14.1 Matrices
14.2 Partitions of Unity


Chapter 1 


The Field of Real Numbers 


Foreword: The objective of this particular advanced calculus text is to present the basics 
of Advanced Calculus at a level which is appropriate for an undergraduate student who 
has taken few proof classes. The primary objective is not to cover a full grounding in 
the subject, but to cover a sufficient foundation to understand basic ideas without many 
detours, so as to keep what is covered concise and understandable. Since the single variable 
portion of the text is intended as more of a stand-alone text than a foundation for later 
analysis courses, we focus somewhat more on sequence arguments and somewhat less on 
more involved topological ideas. Enrichment topics are listed at the end for those who 
would like to incorporate more of an associated idea than the planned core topics listed. 
We use topological ideas more in the later part of the text. 

As with most advanced calculus and analysis texts, the proofs themselves are informed 
by many sources. While the proofs are written by the author, no theorem or proof approach 
in this text is actually due to the author (they are all standard theorems with presumably 
typical proof approaches as found in assorted texts in the literature, most of which cannot 
really be tracked down to an original author who first presented a given method of argument). 
Though I have mentioned references to a few of the longer arguments that are particularly 
close to those found in other sources, I have not normally made an effort to find original 
sources to particular proofs for most theorems. I apologize in advance to any to whom I 
may not have given adequate attribution. 


Primitive Notions 


Before we discuss the main body of the subject matter, it is probably appropriate to 
mention that all of these proofs assume the rules of logic and the axioms of set theory at 
their foundation. Since we are intentionally attempting not to wander too far into tangents, 
we will just say that mathematics has some fundamental assumptions that it is okay to take 
unions of sets and define subsets of sets consisting of elements satisfying certain statements, 
and take sets of subsets of sets and choose a set of elements consisting of one element from 
each of a collection of sets that are known to be non-empty, and that mathematical proofs
should follow intuitively clear logical assumptions like "if it is true that whenever statement
A is true, statement B is true, and it is also true that whenever statement B is true,
statement C is true, then it must follow that whenever statement A is true, statement C is
true." We will not formalize these ideas in this book.


2 CHAPTER 1. THE FIELD OF REAL NUMBERS 


Definition 1 


The notation $x \in S$ indicates that $x$ is an element of the set $S$. We think of a
set as a collection of points. Point is often another term we use for an element of
a set (particularly if that set is the real numbers or Euclidean space of any other
dimension). We say that $S \subseteq T$ (meaning $S$ is contained in $T$ or is a subset of $T$)
if every element of $S$ is an element of $T$. We use $S \subset T$ to mean that $S$ is a proper
subset of $T$ (a subset which is not equal to $T$). Two sets are equal if they have the
same elements.


After the notion of a set, one of the most fundamental notions is that of a function, 
which we define next. 


Definition 2 


The notation $A \times B$ (often read "A cross B") refers to the set of ordered pairs
$(a,b)$ so that $a \in A$ and $b \in B$. This is referred to as the Cartesian product of $A$ and
$B$ or as the cross product of $A$ and $B$.

A function $f : A \to B$ is a subset of $A \times B$ so that for every $a \in A$ there is exactly
one $b \in B$ so that $(a,b) \in f$. Normally, instead of writing "$(a,b) \in f$" we use the
notation $f(a) = b$. In this way, we may think of $f$ as a way of assigning a point of
$B$ to each point of $A$. We also say that if $f(a) = b$ then $b$ is the image of point $a$. If
$g : B \to C$ then we define the composition $g \circ f : A \to C$ to be the set of all points
$(a,c)$ of $A \times C$ so that $(a,b) \in f$ and $(b,c) \in g$ for some $b \in B$. This is also written
$g \circ f = g(f)$, so we would say that $g \circ f(a) = c$ or $g(f(a)) = c$ if $(a,c) \in g \circ f$.

Associated with a function we can talk about the image of a set or the inverse
image (or pre-image) of a set. If $U \subseteq A$ we define $f(U) = \{b \in B \mid f(a) = b$ for some
$a \in U\}$. We define $f^{-1}(V) = \{a \in A \mid f(a) \in V\}$. Note that the definition of the inverse
image of a set does not require a function to be invertible for sets to have inverse images.
We call $f(A)$ the range of $f$ (denoted $\operatorname{ran}(f)$) and $A$ the domain of $f$ (denoted
$\operatorname{dom}(f)$). The implied domain of an expression for a function with real numbers as
the codomain (a function for which the formula gives real valued outputs) is the set
of real numbers for which the function could be defined by the expression given for
the function. This is understood to be the domain when no domain is specified. The
implied domain of $f + g$ and $fg$ is $\operatorname{dom}(f) \cap \operatorname{dom}(g)$ for two real valued functions $f, g$,
and the implied domain of $\frac{f}{g}$ is $\{x \in \operatorname{dom}(f) \cap \operatorname{dom}(g) \mid g(x) \neq 0\}$, and the implied
domain of $f \circ g$ is $\{x \in \operatorname{dom}(g) \mid g(x) \in \operatorname{dom}(f)\}$.

We say that $f$ is one to one if whenever $x \neq y$ and $x, y \in \operatorname{dom}(f)$, $f(x) \neq f(y)$.
This can also be stated as saying that if $f(x) = f(y)$ then $x = y$.

We say $f : A \to B$ is onto (with respect to specified codomain $B$) if for each $b \in B$
there is an $a \in A$ so that $f(a) = b$. Another way of saying this is that $\operatorname{ran}(f) = B$. A
function which is one to one and onto is a one to one correspondence. If $f : A \to B$ is a
one to one correspondence then we define the inverse function of $f$ or just the inverse
of $f$ to be the function $f^{-1} : B \to A$ defined by $f^{-1} = \{(b,a) \in B \times A \mid (a,b) \in f\}$,
and we say that $f$ is invertible.
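
As an informal illustration of Definition 2, the following Python sketch models a function on
small finite sets as a set of ordered pairs. The helper names (`is_function`, `image`,
`preimage`, `compose`, `inverse`, `is_one_to_one`, `is_onto`) and the sample sets are our own
assumptions chosen for illustration; they are not notation used elsewhere in this text.

```python
from itertools import product

def is_function(f, A, B):
    """True if f (a set of pairs) is a subset of A x B in which
    every a in A appears as a first coordinate exactly once."""
    return (f <= set(product(A, B))
            and all(sum(1 for (x, _) in f if x == a) == 1 for a in A))

def image(f, U):
    """f(U) = {b : f(a) = b for some a in U}."""
    return {b for (a, b) in f if a in U}

def preimage(f, V):
    """f^{-1}(V) = {a : f(a) in V}; f does not need to be invertible."""
    return {a for (a, b) in f if b in V}

def compose(g, f):
    """g o f = {(a, c) : (a, b) in f and (b, c) in g for some b}."""
    return {(a, c) for (a, b) in f for (b2, c) in g if b == b2}

def inverse(f):
    """{(b, a) : (a, b) in f}; a function only when f is a one to one correspondence."""
    return {(b, a) for (a, b) in f}

def is_one_to_one(f):
    """No output is shared by two distinct inputs."""
    outputs = [b for (_, b) in f]
    return len(outputs) == len(set(outputs))

def is_onto(f, B):
    """The range of f equals the specified codomain B."""
    return {b for (_, b) in f} == set(B)

# Hypothetical sample data: squaring on a five-point domain.
A = {-2, -1, 0, 1, 2}
B = {0, 1, 4}
f = {(a, a * a) for a in A}
g = {(0, "zero"), (1, "one"), (4, "four")}

print(is_function(f, A, B))
print(image(f, {1, 2}), preimage(f, {4}))
print(is_one_to_one(f), is_onto(f, B))
print(compose(g, f))
```

Here `preimage(f, {4})` contains two points even though `f` is not invertible, which matches
the remark that inverse images of sets do not require an inverse function.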


Discussions about sets will frequently require being able to combine or separate them, 
so we next define union and intersection. 


Definition 3 


The notation $A \cup B$ means the union of $A$ and $B$, and is the set of points which
are in at least one of the sets $A$ or $B$. The notation $A \cap B$ means the intersection of
$A$ and $B$, and is the set of points which are in both set $A$ and set $B$. The complement
of set $B$ in set $A$ is denoted $A \setminus B$ and is the set of points of $A$ which are not points of
$B$. If $U_\alpha$ is a set for every $\alpha \in J$ (where $J$ is an unspecified set of indices) then we can
use the notation $\bigcup_{\alpha \in J} U_\alpha$ to denote the union of all sets $U_\alpha$, which is the set of points
in at least one of the $U_\alpha$ sets. We can define intersection over arbitrarily many indices
similarly. A point $x \in \bigcap_{\alpha \in J} U_\alpha$ if $x$ is an element of every set $U_\alpha$. If $C$ is a set of sets
then $\bigcup C$ refers to the union of all elements of $C$ and $\bigcap C$ refers to the intersection
of all elements of $C$. If $\mathbb{N}$ represents the set of natural numbers (to be defined later),
then $\bigcap_{n \in \mathbb{N}} A_n$ and $\bigcap_{n=1}^{\infty} A_n$ mean the same thing. Likewise, $\bigcup_{n \in \mathbb{N}} A_n = \bigcup_{n=1}^{\infty} A_n$.
If $C = \{U_\alpha\}_{\alpha \in J}$ is a collection of sets so that $U_\alpha \cap U_\beta = \emptyset$ for all $\alpha \neq \beta$ in $J$ then
we say that $C$ is a pairwise disjoint collection of sets or that the $U_\alpha$ sets are pairwise
disjoint. We may just say the collection of sets is disjoint and leave off the "pairwise."
For instance, we may say that sets $A$ and $B$ are disjoint rather than saying $\{A, B\}$
is a pairwise disjoint collection of sets.
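
For finite collections, the operations in Definition 3 can be tried out directly. The sketch
below uses Python sets; the function names and the indexed family `U` are hypothetical choices
made only for this illustration.

```python
from itertools import combinations

def big_union(C):
    """Union of all elements of the collection C."""
    result = set()
    for U_alpha in C:
        result |= U_alpha
    return result

def big_intersection(C):
    """Intersection of all elements of a non-empty collection C."""
    sets = list(C)
    result = set(sets[0])
    for U_alpha in sets[1:]:
        result &= U_alpha
    return result

def pairwise_disjoint(C):
    """True if every two distinct members of C have empty intersection."""
    return all(X.isdisjoint(Y) for X, Y in combinations(list(C), 2))

# Hypothetical indexed family U_alpha for alpha in J = {1, 2, 3}:
# U_1 = {1, 2, 3}, U_2 = {2, 3, 4}, U_3 = {3, 4, 5}
J = [1, 2, 3]
U = {alpha: set(range(alpha, alpha + 3)) for alpha in J}

print(big_union(U.values()))
print(big_intersection(U.values()))
print(U[1] - U[2])                      # the complement of U_2 in U_1
print(pairwise_disjoint(U.values()))
print(pairwise_disjoint([{1}, {2}, {3}]))
```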


In general, if we use an unspecified set like $J$ in the statement of a theorem without
specifying what $J$ is, then it should be understood that the statement is being said to be
true for any arbitrary set $J$. It is worth mentioning that the notation $A \subset B$ is frequently
used in mathematical texts to mean $A \subseteq B$, but we will not adopt that convention for this
book.

Note that from a strictly set-theoretic point of view it is somewhat lacking in
rigor to simply define an ordered pair and assume it exists. Rather, we would say an ordered
pair $(a,b)$ is the set $\{\{a,b\}, \{b\}\}$. However, it is more convenient to write an ordered pair
as just $(a,b)$. Also, in some texts the term "mapping" may refer to a continuous function
or homomorphism, but we will be using the word to refer to any function.

We have not defined the real numbers, but readers will have a pretty good idea what the 
real number system represents so we will use that set to illustrate the set notions discussed 
above despite the fact that it is not defined yet. So, for the next two examples, we will 
assume that we already understand what the real numbers and natural numbers are. 




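Example 1.1. Let $A = [-1,4]$ and let $B = (3,5)$, and let $f : \mathbb{R} \to \mathbb{R}$ be defined by
$f(x) = x^2$. Determine:
(a) $A \cup B$
(b) $A \cap B$
(c) $A \setminus B$
(d) $f(A)$
(e) $f^{-1}(A)$
(f) $\operatorname{dom}(f)$
(g) $\operatorname{ran}(f)$
(h) Is $f$ one to one?
(i) Is $f$ onto?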


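Solution. (a) $A \cup B = [-1,5)$
(b) $A \cap B = (3,4]$
(c) $A \setminus B = [-1,3]$
(d) $f(A) = [0,16]$
(e) $f^{-1}(A) = [-2,2]$
(f) $\operatorname{dom}(f) = \mathbb{R}$
(g) $\operatorname{ran}(f) = [0,\infty)$
(h) $f$ is not one to one because $f(x) = f(-x)$ for all $x \in \mathbb{R}$ (for example, $f(1) = f(-1)$
although $1 \neq -1$).
(i) $f$ is not onto because $f$ is written as $f : \mathbb{R} \to \mathbb{R}$ so that the listed codomain is not
the same as the range.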
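
The answers to Example 1.1 can be spot-checked numerically. The sketch below tests membership
on a finite grid of rational sample points (multiples of $\frac{1}{4}$ between $-6$ and $6$, an
arbitrary choice of ours), so it is only a sanity check on those points and not a proof. The
names `in_A`, `in_B` and `grid` are hypothetical.

```python
from fractions import Fraction

def in_A(x):
    """Membership in A = [-1, 4]."""
    return -1 <= x <= 4

def in_B(x):
    """Membership in B = (3, 5)."""
    return 3 < x < 5

def f(x):
    return x * x

# Finite grid of sample points; exact rational arithmetic avoids rounding issues.
grid = [Fraction(k, 4) for k in range(-24, 25)]

union = [x for x in grid if in_A(x) or in_B(x)]
intersection = [x for x in grid if in_A(x) and in_B(x)]
difference = [x for x in grid if in_A(x) and not in_B(x)]
preimage = [x for x in grid if in_A(f(x))]        # sample points of f^{-1}(A)
image = sorted({f(x) for x in grid if in_A(x)})   # sample points of f(A)

print(union == [x for x in grid if -1 <= x < 5])         # (a) [-1, 5)
print(intersection == [x for x in grid if 3 < x <= 4])   # (b) (3, 4]
print(difference == [x for x in grid if -1 <= x <= 3])   # (c) [-1, 3]
print(image[0], image[-1])                               # (d) endpoints of [0, 16]
print(preimage == [x for x in grid if -2 <= x <= 2])     # (e) [-2, 2]
```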


Notice that whether a function is onto is a property of the specified codomain for the
function. The function itself is the same set of ordered pairs whether we write $f : \mathbb{R} \to \mathbb{R}$ or
$f : \mathbb{R} \to [0,\infty)$. However, if we use the first notation then the specified codomain is $\mathbb{R}$ which
contains $[0,\infty)$ as a proper subset. Since $\operatorname{ran}(f) = [0,\infty) \subset \mathbb{R}$, in the notation $f : \mathbb{R} \to \mathbb{R}$,
the function $f$ is not onto, but if we designate the function as $f : \mathbb{R} \to [0,\infty)$ instead then
the function is onto with respect to the new codomain (even though it is exactly the same
function).


Example 1.2. Let $A_n = [-1 - \frac{1}{n}, 1 + 4n]$ for each $n \in \mathbb{N}$ (in other words $n = 1, 2, 3, 4, \ldots$).
Find
(i) $\bigcup_{n=1}^{\infty} A_n$


Truth Tables, Logic and ZFC 


A statement is symbolized with a letter in propositional calculus. The symbol $\wedge$ means
"and" and $\vee$ means "or." Saying "statements $p$ and $q$ are both true" is symbolized by $p \wedge q$
and is referred to as a conjunction. Saying "statement $p$ is true or statement $q$ is true (or
both)" is symbolized by $p \vee q$ and is referred to as a disjunction. Statements $p$ and $q$ are
thought of as "atoms" in these statements, or statements on whose truth the truth of the
conjunction or disjunction depends. In mathematics "or" always means "inclusive or," so
if either statement is true or both are true then the disjunction is true. We actually very
rarely talk about the terms "conjunction" or "disjunction" except in discrete mathematics
and formal logic. We are only presenting this brief introduction to give students a notion
of logical inference to better understand what is meant by a proof. For some students the
symbols will be useful, for others they will add confusion, but the formality provides a
structure that illustrates the type of rigor we hope to see in arguments using words. Logical
negation $\neg p$ being true means "$p$ is false." We assume the following truth table without
proof (it is a principle of logic we cannot prove, and is essentially how we define when
"and" and "or" are true based on their atomic statements).

All "statements" in this text that are referred to are assumed to be well defined and
properly formulated within the language of logic and set theory and are either true
or false but not both (we will not refer to sentences like "this statement is false" as a
"statement" for example).


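We will refer to this as the Primitive Truth Table:


$p$ | $q$ | $p \wedge q$ | $p \vee q$ | $\neg p$
T | T | T | T | F
T | F | F | T | F
F | T | F | T | T
F | F | F | F | T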


This truth table can be used with other basic statements about logic to derive rules of
inference under the assumption that any statement can be used as an atomic statement and
conjunctions and disjunctions placed back into the truth table to determine its truth based
on the truth of the atomic statements any number of times. We use $p \to q$ to indicate "if $p$
is true then $q$ is true." This is equivalent to the statement $\neg p \vee q$ because if we use a truth
table then we see that whatever entries of true or false we assign to $p$ and $q$, $\neg p \vee q$ is true
whenever $p \to q$ is true, and vice versa. Two logical statements $a$ and $b$ are equivalent if
one is true if and only if the other is true, in which case we sometimes indicate this with the
symbol $a \leftrightarrow b$. To save time, since our logic development is intended to be fairly brief, we
are simply going to define "if $p$ is true then $q$ is true" to mean $\neg p \vee q$ is a true statement,
which is why we do not have an entry for $p \to q$ on our primitive truth table. If $p$ is true
and $\neg p \vee q$ is a true statement then since $\neg p$ is false it follows that $q$ must be true. More
formally, on the following truth table we see that whenever $p$ is true it is true that $q$ is true
if and only if $\neg p \vee q$ is a true statement. This truth table is one in which we use a portion of
the preceding truth table to determine the truth of $\neg p$ and then use the $\neg p$ and $q$ columns
for the $p$ and $q$ columns in the first truth table to obtain the entries in the $\neg p \vee q$ column.




$p$ | $q$ | $\neg p \vee q$ | $\neg p$
T | T | T | F
T | F | F | F
F | T | T | T
F | F | T | T


A statement which is always true is called a tautology, such as $p \vee \neg p$. A statement which
is never true is a contradiction, such as $p \wedge \neg p$. Here is a collection of standard rules of
inference with names. You are not required to memorize the names of these rules. Verifying
a rule of inference can be done by using the primitive truth table's assumptions and making
columns for the statements in the hypotheses using the atoms in the hypotheses and then
using these larger statements as atoms in the same truth table, showing that whenever the
full statement which is the hypothesis is true, the conclusion is also true. In this manner,
we determine that the implication given is a tautology (it is always true) and is thus a valid
rule of inference. It is also fine to use earlier established rules of inference to derive other
rules of inference without using a truth table.


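Modus Ponens: $(p \wedge (p \to q)) \to q$
Modus Tollens: $(\neg q \wedge (p \to q)) \to \neg p$
Reductio Ad Absurdum: $((p \to q) \wedge (p \to \neg q)) \to \neg p$
Noncontradiction: $(p \wedge \neg p) \to q$
Double Negation: $\neg\neg p \leftrightarrow p$
Case Analysis: $((p \vee q) \wedge (p \to r) \wedge (q \to r)) \to r$
Disjunctive Syllogism: $((p \vee q) \wedge (\neg p)) \to q$
Constructive Dilemma: $((p \to r) \wedge (q \to s) \wedge (p \vee q)) \to (r \vee s)$
Absorption: $(p \to q) \to (p \to (p \wedge q))$
Hypothetical Syllogism: $((p \to q) \wedge (q \to r)) \to (p \to r)$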
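
The truth table verification described above can be automated for rules with a small number
of atoms. The following Python sketch checks each named rule by running through every
assignment of true and false to its atoms. The formulas are encoded in their usual textbook
forms, which we assume match the rules listed above, and the helper names `implies`, `iff`
and `is_tautology` are our own.

```python
from itertools import product

def implies(p, q):
    """'If p then q', defined as (not p) or q, as in the text."""
    return (not p) or q

def iff(p, q):
    """Two statements are equivalent when they have the same truth value."""
    return p == q

def is_tautology(formula, n_atoms):
    """True if formula holds for every assignment of truth values to its atoms."""
    return all(formula(*values)
               for values in product([True, False], repeat=n_atoms))

# Each entry: (formula as a function of its atoms, number of atoms).
rules = {
    "Modus Ponens": (lambda p, q: implies(p and implies(p, q), q), 2),
    "Modus Tollens": (lambda p, q: implies((not q) and implies(p, q), not p), 2),
    "Reductio Ad Absurdum": (
        lambda p, q: implies(implies(p, q) and implies(p, not q), not p), 2),
    "Noncontradiction": (lambda p, q: implies(p and (not p), q), 2),
    "Double Negation": (lambda p: iff(not (not p), p), 1),
    "Case Analysis": (
        lambda p, q, r: implies((p or q) and implies(p, r) and implies(q, r), r), 3),
    "Disjunctive Syllogism": (lambda p, q: implies((p or q) and (not p), q), 2),
    "Constructive Dilemma": (
        lambda p, q, r, s: implies(
            implies(p, r) and implies(q, s) and (p or q), r or s), 4),
    "Absorption": (lambda p, q: implies(implies(p, q), implies(p, p and q)), 2),
    "Hypothetical Syllogism": (
        lambda p, q, r: implies(implies(p, q) and implies(q, r), implies(p, r)), 3),
}

for name, (formula, n_atoms) in rules.items():
    print(name, is_tautology(formula, n_atoms))

# A non-example: (p -> q) -> (q -> p) fails when p is false and q is true.
print(is_tautology(lambda p, q: implies(implies(p, q), implies(q, p)), 2))
```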


These are just rules for propositional logic. This is entirely inadequate for arguments in
mathematics, and in and of itself it models almost none of the proofs in advanced calculus.
First order logic also includes the ideas of sets and quantifiers. In this discussion we are
allowed to use statements that depend on a variable so that we can say that a statement
is true for some or all elements of a set, and if $P$ is a statement about elements $x$ then
$P(x)$ means the statement $P$ is true about $x$. The statement "for every point $x$ in a set $A$,
the statement $P(x)$ is true" is written $\forall x \in A(P(x))$ and "there exists some $x \in A$ so that
$P(x)$ is true" is written $\exists x \in A(P(x))$. Similar to the preceding truth table we have as an
assumption that $\neg(\forall x \in A(P(x))) \leftrightarrow (\exists x \in A(\neg P(x)))$. A variable $x$ which a statement's
truth depends on is bound if it is in a quantifier, and free otherwise. For instance, in the
statement "for every real number $x$ it is true that $x^2 > y$", the variable $x$ is bound because
it is the variable for the quantifier "for every $x$", whereas the variable $y$ is free. By adding
these quantifiers in, we are now able to make the majority of statements that one sees in
an advanced calculus book. Generally, an atomic statement is appended for each value in
a given set about which a statement with a universal quantifier or statement of existence is
used.


While it can be useful to use this sort of notation to keep your arguments straight at 
times, most of the time it is easier to understand and easier to prove statements using words, 
but if your proof is not clear enough that it could be replaced by symbolic logic statements 


in an obvious way, then there is a good chance you have not written out a complete proof. 
What things must be included varies depending on background and audience and the place 
in a sequence of arguments in which a proof occurs. For instance, at the beginning of our 
development when we are proving statements about field axioms, care must be taken to 
cite every step and justification and axiom. Later in the course we will not refer to a rule 
when dividing both sides of an equation by a number or cancel two objects in a division 
because it is assumed that this foundation is understood and well known at that point, but 
we should be able to reduce the argument to axioms and logical rules if asked. If we cannot 
then we probably have not written a rigorous argument. 

One consequence of the preceding rules is that when creating a negation of a statement
(a statement which is true if and only if the original statement is false) we can simply
replace $\forall$ by $\exists$ and $\exists$ by $\forall$ and replace $\wedge$ by $\vee$ and $\vee$ by $\wedge$ and replace all atomic statements
by their negations. For example, later in this text we will learn that if $c$ is a limit point of
the domain $D$ of a function $f$, then the definition of $\lim_{x \to c} f(x) = L$ is that for every $\epsilon > 0$
there is a $\delta > 0$ so that if $0 < |x - c| < \delta$ and $x \in D$ then $|f(x) - L| < \epsilon$. If we were to write
that as a statement in terse logical notation and we let $P$ be the set of positive numbers,
we could write $\forall \epsilon \in P(\exists \delta \in P(\forall x \in D(|f(x) - L| < \epsilon \vee \neg(0 < |x - c| < \delta))))$, with negation
$\exists \epsilon \in P(\forall \delta \in P(\exists x \in D((0 < |x - c| < \delta) \wedge \neg(|f(x) - L| < \epsilon))))$. In words, that would be
"for some $\epsilon > 0$, for each $\delta > 0$ there is some real number $x \in D$ so that $0 < |x - c| < \delta$ but
$|f(x) - L| \geq \epsilon$." Note that "but" means the same thing as "and" in mathematics, and the
main reason to use one rather than the other is to draw attention to certain implications of
the statement to the reader.
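
The mechanical nature of this negation rule can be seen by evaluating a quantified statement
and its negation over finite sets, where "for every" becomes Python's `all` and "there exists"
becomes `any`. The sketch below is only an illustration of the logical form: the finite lists
`eps_values`, `delta_values` and `xs` are arbitrary stand-ins of ours for the positive numbers
and the domain $D$, so it does not decide whether an actual limit exists.

```python
from fractions import Fraction

# Hypothetical finite stand-ins for P (positive numbers) and D (the domain).
eps_values = [Fraction(1, k) for k in (1, 2, 4, 8)]
delta_values = [Fraction(1, k) for k in (1, 2, 4, 8)]
xs = [Fraction(k, 16) for k in range(-32, 33)]

def statement(f, c, L):
    """for all eps, exists delta, for all x: |f(x) - L| < eps or not (0 < |x - c| < delta)."""
    return all(
        any(
            all(abs(f(x) - L) < eps or not (0 < abs(x - c) < delta) for x in xs)
            for delta in delta_values)
        for eps in eps_values)

def negation(f, c, L):
    """exists eps, for all delta, exists x: (0 < |x - c| < delta) and not (|f(x) - L| < eps)."""
    return any(
        all(
            any((0 < abs(x - c) < delta) and not (abs(f(x) - L) < eps) for x in xs)
            for delta in delta_values)
        for eps in eps_values)

identity = lambda x: x

# On these samples the statement holds for L = 0 and its negation holds for L = 1.
print(statement(identity, 0, 0), negation(identity, 0, 0))
print(statement(identity, 0, 1), negation(identity, 0, 1))
```

In each case exactly one of the two functions returns `True`, as expected of a statement and
its negation.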


The Axioms of Zermelo-Fraenkel Set Theory with Choice (ZFC) are (in words as intuitive
descriptions, and not in any particular order):


Axiom 0: Existence. There is a set. 
Axiom 1: Pairing. If A and B are sets then there is a set {A, B}. 


Axiom 2: Union. If C is a collection of sets then there is a set containing every point 
which is an element of at least one element of C. 


Axiom 3: Extensionality. Two sets are equal if and only if they contain the same
elements.


Axiom 4: Foundation. Every non-empty set $A$ contains an element $B$ which does not share any
elements with $A$.


Axiom 5: Separation. For a well defined statement $\phi$ about a point (which is true or
false but not both) for each element of a set $S$, the subset $A = \{x \in S \mid \phi(x)$ is true$\}$ exists.


Axiom 6: Infinity. There is a set $\omega$ containing the empty set as an element having the
property that for every $\alpha \in \omega$, the point $\alpha \cup \{\alpha\} \in \omega$.


Axiom 7: Schema of Replacement. If $S$ is a set and $\phi$ is a well defined statement so that
for each $\alpha \in S$ there is exactly one $\beta$ such that $\phi(\alpha, \beta)$ is true, then the set
$T = \{\beta \mid \phi(\alpha, \beta)$ is true for some $\alpha \in S\}$ exists.


Axiom 8: Power Set. If $A$ is a set then there is a set containing all subsets of $A$ (and
hence, by Separation, there is a set consisting of exactly the subsets of $A$, which we refer
to as the power set of $A$).


Axiom 9: Choice. Every set can be linearly ordered in such a way that every non-empty 
subset has a least element. 

This is also written as: If A is a set of non-empty sets then there is a set B consisting 
of exactly one point from each element of A. 


We will not be referring to these axioms or to these logical rules explicitly, but it would
be lacking in rigor not to at least mention that these concepts are
being used all the time (every time we define a set or take a union, for instance, we are
making use of an assumption listed above). The two forms of Choice above can be shown
to be equivalent, but this would detract from the main point of the text (it is a somewhat
lengthy set theory argument). The second form of the axiom is the one most needed for
our arguments, but the first form proves the second much more easily than the second form
can be used to prove the first. In fact, if we just use the well-ordering (first) form of the
axiom, take a collection $S$ of non-empty sets and then well-order the union of $S$ using the
axiom, the set consisting of the first element of each of the elements of $S$ (which exists by
the Replacement Schema) is a set of the form specified in the second form of the axiom. Thus,
though it is less natural, we will use the first form of the axiom as the Axiom of Choice for
our development though, as mentioned, we never actually refer to it in advanced calculus
(we just use it without saying so).
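
The passage from the well-ordering form to the choice form can be pictured with finite sets,
where no axiom is actually needed. In the sketch below, a Python list plays the role of the
well-ordering of the union, and the chosen point of each set is its first element in that
ordering. The function name, the sample collection and the sample ordering are all
hypothetical.

```python
def choice_from_well_ordering(collection, order):
    """Pick the first element of each set with respect to `order`.

    `collection` is a list of non-empty sets and `order` lists every element
    of their union exactly once (earlier in the list means smaller).
    """
    rank = {x: i for i, x in enumerate(order)}
    return {frozenset(S): min(S, key=rank.__getitem__) for S in collection}

# Hypothetical collection of non-empty sets and an ordering of their union.
collection = [{"c", "a"}, {"b", "d"}, {"d"}]
order = ["d", "c", "b", "a"]

chosen = choice_from_well_ordering(collection, order)
for S, x in chosen.items():
    print(sorted(S), "->", x)
```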


Field Axioms 


Definition 4 


A set $S$ together with functions $+, \cdot : S \times S \to S$ (called binary operations on $S$),
is a field if it satisfies the following requirements, which we will refer to as axioms
of a field. Note that rather than writing $+(a,b) = c$ we write $a + b = c$ (read $a$
plus $b$ equals $c$), and rather than writing $\cdot(a,b) = c$ we write $ab = c$ (read $a$ times
$b$ equals $c$), and use parentheses to indicate order of operations. Operations within
parentheses are understood to occur first, and otherwise multiplication is understood
to be applied before addition. Thus, $ab + c$ means $+(\cdot(a,b),c)$, for example.

For all $a, b, c \in S$ the following are true:

Commutativity: $a + b = b + a$ and $ab = ba$

Associativity: $a + (b + c) = (a + b) + c$ and $a(bc) = (ab)c$

Distributivity: $a(b + c) = ab + ac$

Identity: There are distinct elements $0, 1 \in S$ so that for any $a \in S$, $a + 0 = a$
and $(a)(1) = a$.

Inverses: For any $a \in S$ there is a point $-a \in S$ (called the additive inverse of $a$)
so that $a + -a = 0$. If $a \neq 0$ then there is a point $\frac{1}{a} \in S$ (called the multiplicative
inverse of $a$) so that $(a)(\frac{1}{a}) = 1$.


For some, using these things as starting assumptions should be motivated, but any 
attempt to do so will likely be motivation of an intuitive kind. In the real numbers, all 
of the statements above should probably make sense due to past experience with the real 
number system. 

In some of the arguments that follow we use the definition of function for the binary
operations described without explicitly saying so. For instance, it is understood that if $a + b = c$
and $a + b = d$ then we can conclude that $c = d$, meaning that $c$ and $d$ are the same element of
$S$. This is because $+$ is a function, which means it must be true (by definition) that $+(a,b)$
is unique. This is important to understand even though it is convention not to mention it
during arguments of this kind.


Definition 5 


We write $\frac{a}{b} = a\frac{1}{b}$. We also write $a - b = a + -b$.


While our goal is to develop the field of real numbers, the axioms stated thus far do not 
do so. In fact, it is possible to have a field with only two elements. 


Example 1.3. Show the set $\{0,1\}$ with operations $0 + 0 = 0$, $0 + 1 = 1 + 0 = 1$,
$1 + 1 = 0$, $(0)(0) = 0$, $(0)(1) = (1)(0) = 0$ and $(1)(1) = 1$ is a field.




Proof. We refer to this field as $\mathbb{Z}_2$. Identity, Commutativity, Associativity and Distributivity
can be checked directly from the operation definitions. The multiplicative inverse of 1 is 1 and
the additive inverse of 1 is 1. The additive inverse of 0 is 0. Since all axioms of a field are
satisfied, $\mathbb{Z}_2$ is a field.
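
Because the set in Example 1.3 is finite, every field axiom can be checked by brute force.
The following Python sketch does so, modelling the operations as addition and multiplication
modulo $n$ (for $n = 2$ these agree with the operations listed in the example). The function
names `mod_arithmetic` and `is_field` are our own, and the comparison with $n = 3$ and $n = 4$
is an added illustration rather than part of the example.

```python
from itertools import product

def mod_arithmetic(n):
    """The set {0, ..., n-1} with addition and multiplication modulo n."""
    S = list(range(n))
    add = lambda a, b: (a + b) % n
    mul = lambda a, b: (a * b) % n
    return S, add, mul

def is_field(S, add, mul, zero=0, one=1):
    """Check every field axiom of Definition 4 by exhausting all cases."""
    triples = list(product(S, repeat=3))
    closed = all(add(a, b) in S and mul(a, b) in S for a, b, _ in triples)
    commutative = all(add(a, b) == add(b, a) and mul(a, b) == mul(b, a)
                      for a, b, _ in triples)
    associative = all(add(a, add(b, c)) == add(add(a, b), c)
                      and mul(a, mul(b, c)) == mul(mul(a, b), c)
                      for a, b, c in triples)
    distributive = all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
                       for a, b, c in triples)
    identity = zero != one and all(add(a, zero) == a and mul(a, one) == a
                                   for a in S)
    inverses = (all(any(add(a, b) == zero for b in S) for a in S)
                and all(any(mul(a, b) == one for b in S) for a in S if a != zero))
    return (closed and commutative and associative
            and distributive and identity and inverses)

print(is_field(*mod_arithmetic(2)))   # the two element field of Example 1.3
print(is_field(*mod_arithmetic(3)))
print(is_field(*mod_arithmetic(4)))   # 2 has no multiplicative inverse modulo 4
```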


As the example above demonstrates, not all fields are the real number system that we
are trying to develop, and at this point we refrain from calling the elements of the field $S$
"numbers."

We first make an observation that we use in these arguments without referencing the
justification. That is, whenever $a = b$ and $a$, $b$ and $c$ are elements of
a field $S$, it follows that $a + c = b + c$ and that $ac = bc$. This is because addition and
multiplication are binary operations, which means that they are functions. If $a$ and $b$ are
the same element of $S$ then since functions have a single output for each input (the image of
a point is unique for a function), we know $+(a,c)$ is unique, meaning that $+(a,c) = +(b,c)$.
Likewise $\cdot(a,c) = \cdot(b,c)$ since $\cdot$ is a function as well. It is common to not mention this when
this property is used, so we just note it here, and after this point we will add or multiply
both sides of an equation by an element of $S$ and assume it is understood that doing so
results in another true equation without further justification.

Likewise, in set theory all objects are sets (even if their elements are not specified) and
two sets are equal if they have the same elements. Thus, if one set is equal to a second and
a third is equal to the second then the first and third are equal. This justifies the (never
mentioned) fact that if $a = b$ and $b = c$ then $a = c$.


Theorem 1.1. Let $S$ be a field. The identities $0, 1$ for $S$ are unique. In other words, if
$b \in S$ and $a + b = a$ for all $a \in S$ then $b = 0$, and if $b \in S$ and $ab = a$ for all $a \in S$ then
$b = 1$.


Proof. Suppose $0$ and $0'$ are additive identities in $S$. Then $0 = 0 + 0' = 0' + 0 = 0'$ (by
Identity and Commutativity). Hence, $0 = 0'$. Similarly, if $1, 1'$ are multiplicative identities
then $1 = (1)(1') = (1')(1) = 1'$.


Theorem 1.2. Let $S$ be a field. For each $a \in S$, if $a + b = 0$ for some $b \in S$ then $b = -a$.
If $a \neq 0$ and $ab = 1$ for some $b \in S$ then $b = \frac{1}{a}$.

Proof. Let $a \in S$ and suppose that $s, t$ are both additive inverses of $a$. Then $a + s = 0 = a + t$
(since both are additive inverses of $a$), so $-a + (a + s) = -a + (a + t)$, so
$(-a + a) + s = (-a + a) + t$ (by Associativity) and $0 + s = 0 + t$ (by Inverses), so $s = t$ (by Identity).

Similarly, if $s, t$ are multiplicative inverses of a non-zero point $a$ then $\frac{1}{a}(as) = \frac{1}{a}(at)$, so
$(\frac{1}{a}a)(s) = (\frac{1}{a}a)(t)$ and $1s = 1t$ so $s = t$.


Note in the structure of the previous proof that we did not quote every axiom a second 
time in the latter part of the proof. It is fairly normal (and generally accepted) to not quote 
an axiom every time it is used if a similar application of the axiom has already been used
within the argument. It is also common to not quote the axiom at all once it has been used
enough times and the audience is understood to be one that is already aware of how the 
axiom is being used. We will continue to quote most uses of axioms for now. 


Theorem 1.3. Let $S$ be a field. For any $a \in S$, $a(0) = 0$.


Proof. We know $a(0) = a(0 + 0) = a(0) + a(0)$ (by Identity and Distributivity), so
$-a(0) + a(0) = -a(0) + (a(0) + a(0))$, and $0 = (-a(0) + a(0)) + a(0)$ so $0 = 0 + a(0) = a(0)$.


Theorem 1.4. Let $S$ be a field. For any $a \in S$, $(-1)(a) = -a$.


Proof. Since $1 + -1 = 0$, by Theorem 1.3 it follows that $0 = a(0) = a(1 + -1) = a(1) + a(-1) = a + a(-1)$
(by Distributivity and Identity), so $a(-1)$ is the (unique) additive inverse of $a$ and is therefore
$-a$ by Theorem 1.2.


Theorem 1.5. Let $S$ be a field. For any $a, b \in S$, $(-b)(a) = -ba$.


Proof. By Theorem 1.4 and Associativity, $(-b)(a) = ((-1)b)(a) = (-1)(ba) = -ba$.


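Theorem 1.6. Let $a, b$ be non-zero elements of a field $S$. Then $\frac{1}{a}\frac{1}{b} = \frac{1}{ab}$.


Proof. We know that $(\frac{1}{a}\frac{1}{b})(ab) = (a\frac{1}{a})(b\frac{1}{b})$ by Associativity and Commutativity, and this
is equal to $(1)(1) = 1$. In particular $ab \neq 0$, since otherwise this product would be $0$ by
Theorem 1.3. By Theorem 1.2, $\frac{1}{a}\frac{1}{b}$ is the unique multiplicative inverse of $ab$, which
makes $\frac{1}{a}\frac{1}{b} = \frac{1}{ab}$.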


Theorem 1.7. Let $a, b$ be non-zero elements of a field $S$. Then $\frac{1}{a} + \frac{1}{b} = \frac{a+b}{ab}$.

Proof. From the preceding theorem we know that $\frac{a+b}{ab} = (a + b)(\frac{1}{a}\frac{1}{b})$. By the Distributive,
Associative and Commutative properties and Identity, this is
$(a\frac{1}{a})\frac{1}{b} + (b\frac{1}{b})\frac{1}{a} = (1)(\frac{1}{b}) + (1)(\frac{1}{a}) = \frac{1}{a} + \frac{1}{b}$.
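
Theorems 1.3 through 1.7 hold in every field, so they can be spot-checked in a small finite
one. The sketch below uses arithmetic modulo the prime $7$, which we assume here to be a field
(it has not been shown in this text), and finds inverses by direct search so that only the
field axioms are used. The names `neg`, `inv` and `checks` are hypothetical.

```python
p = 7                      # assumed prime, so arithmetic modulo p is a field
S = range(p)
nonzero = [a for a in S if a != 0]

def add(a, b):
    return (a + b) % p

def mul(a, b):
    return (a * b) % p

def neg(a):
    """The additive inverse of a, found by searching S."""
    return next(b for b in S if add(a, b) == 0)

def inv(a):
    """The multiplicative inverse of a non-zero a, found by searching S."""
    return next(b for b in S if mul(a, b) == 1)

checks = {
    "Theorem 1.3": all(mul(a, 0) == 0 for a in S),
    "Theorem 1.4": all(mul(neg(1), a) == neg(a) for a in S),
    "Theorem 1.5": all(mul(neg(b), a) == neg(mul(b, a)) for a in S for b in S),
    "Theorem 1.6": all(mul(inv(a), inv(b)) == inv(mul(a, b))
                       for a in nonzero for b in nonzero),
    "Theorem 1.7": all(add(inv(a), inv(b)) == mul(add(a, b), inv(mul(a, b)))
                       for a in nonzero for b in nonzero),
}

for name, holds in checks.items():
    print(name, holds)
```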




Definition 6 


A field $S$ is ordered if there is a set $P \subseteq S$ of elements of $S$ which are referred to
as being positive numbers, satisfying the following conditions for all $a, b \in S$.
(a) Trichotomy: Exactly one of $a \in P$, $-a \in P$ or $a = 0$ is true for each $a \in S$.
(b) Closure: If $a, b \in P$ then $a + b \in P$ and $ab \in P$.


We use the notation $a < b$ or $b > a$ to mean $b - a \in P$ (note that $a > 0$ is thus the same
as $a - 0 = a \in P$). The notation $a \leq b$ means that either $a = b$ or $a < b$. If $a < 0$ we refer
to $a$ as being negative or a negative number. We let 2 represent $1 + 1$, 3 denote $2 + 1$, and so
on. Note that there are ordered fields which do not consist of numbers, but we are focused
on the real line in this course so at this point we will use the term "number" to refer to an
element of an ordered field.


From this point forward we may omit referencing the standard field axioms (other than
the order axioms) when we use them if their use seems fairly clear. Deciding which axioms
or theorems should be explicitly stated in proofs depends on the audience and context
and is difficult, particularly for readers beginning with proof writing. It is never wrong
to include more detail, specifying every axiom and theorem. So, when in doubt, it is
sensible to write more than the reader needs to see.

For the theorems that follow in this chapter, it is understood that S is an ordered field 
and P is a subset of S consisting of the positive numbers satisfying the order axioms listed, 
and that all numbers stated are elements of S. 


Theorem 1.8. Let $S$ be an ordered field in which $a < b$ and $c \in P$. Then $ca < cb$.


Proof. Since $a < b$, $b - a \in P$ by definition, so $c(b - a) \in P$ by Closure, so $cb - ca \in P$,
which means $ca < cb$.


Theorem 1.9. Let $S$ be an ordered field in which $a < b$ and $c \in S$. Then $a + c < b + c$.

Proof. Since $a < b$, $b - a \in P$. Since $(b + c) - (a + c) = b - a$, it follows that
$(b + c) - (a + c) \in P$, so $a + c < b + c$.


Theorem 1.10. $1 > 0$.


Proof. By the Identity axiom, $1 \neq 0$. Suppose $1 < 0$. Then $0 - 1 = -1 \in P$, so $(-1)(-1) \in P$
by Closure. By Theorems 1.4 and 1.2, $(-1)(-1) = -(-1) = 1$, so $1 \in P$. Having both $1 \in P$ and
$-1 \in P$ contradicts Trichotomy and is therefore impossible. Thus, by
Trichotomy it must follow that $1 > 0$.


Theorem 1.11. Let $S$ be an ordered field in which $a < b$ and $-c \in P$. Then $ca > cb$.


Proof. Since $-c \in P$, $-c(a) < -c(b)$ by Theorem 1.8. So, $-ca < -cb$ and
$-ca + (ca + cb) < -cb + (ca + cb)$ by Theorem 1.9, so $cb < ca$.




Theorem 1.12. Let $S$ be an ordered field and let $a < b$ and $b < c$. Then $a < c$.


Proof. Since $b - a \in P$ and $c - b \in P$ we know that $(b - a) + (c - b) = c - a \in P$ by Closure,
so $a < c$.


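Theorem 1.13. Let $S$ be an ordered field and let $a \neq 0$. Then $a$ is positive if and only if
$\frac{1}{a}$ is positive.


Proof. Assume $a > 0$. Since $a\frac{1}{a} = 1 \neq 0$, Theorem 1.3 shows that $\frac{1}{a} \neq 0$. Suppose $\frac{1}{a} < 0$.
Then $a\frac{1}{a} < 0$ by Theorem 1.11 so $1 < 0$, a contradiction of Theorem 1.10. Hence $\frac{1}{a} > 0$
by Trichotomy.
Likewise, if $\frac{1}{a}$ is positive then, since its multiplicative inverse is $a$, by the preceding
argument $a$ is positive.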
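
The order theorems above can be tested on sample values from a familiar ordered field. The
sketch below uses Python's exact rational numbers and defines the order from a set of
positive elements, following Definition 6: `positive(x)` plays the role of $x \in P$ and
`less(a, b)` means $b - a \in P$. That the rationals form an ordered field is assumed here,
and the sample values are arbitrary, so this is a sanity check rather than a proof.

```python
from fractions import Fraction
from itertools import product

def positive(x):
    """Membership in P; a Fraction always stores a positive denominator."""
    return x.numerator > 0

def less(a, b):
    """a < b means b - a is in P."""
    return positive(b - a)

# Arbitrary sample of rationals, with duplicates removed.
samples = sorted({Fraction(n, d) for n in range(-4, 5) for d in (1, 2, 3)})
pairs = list(product(samples, repeat=2))
triples = list(product(samples, repeat=3))

checks = {
    "Trichotomy": all([positive(a), positive(-a), a == 0].count(True) == 1
                      for a in samples),
    "Closure": all(positive(a + b) and positive(a * b)
                   for a, b in pairs if positive(a) and positive(b)),
    "Theorem 1.8": all(less(c * a, c * b)
                       for a, b, c in triples if less(a, b) and positive(c)),
    "Theorem 1.9": all(less(a + c, b + c) for a, b, c in triples if less(a, b)),
    "Theorem 1.10": positive(Fraction(1)),
    "Theorem 1.11": all(less(c * b, c * a)
                        for a, b, c in triples if less(a, b) and positive(-c)),
    "Theorem 1.12": all(less(a, c)
                        for a, b, c in triples if less(a, b) and less(b, c)),
    "Theorem 1.13": all(positive(a) == positive(1 / a)
                        for a in samples if a != 0),
}

for name, holds in checks.items():
    print(name, holds)
```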


Definition 7 


A subset $A$ of an ordered field $S$ is said to be bounded above if there is a point
$u \in S$ so that $u \geq x$ for all $x \in A$. If this is the case then we call $u$ an upper bound
for $A$. Similarly, $A$ is said to be bounded below if there is a point $l \in S$ so that $l \leq x$
for all $x \in A$. If this is the case then we call $l$ a lower bound for $A$. A set is bounded
if it has both an upper and a lower bound. If there is a point $t \in S$ which is an upper
bound of $A$ so that $t \leq u$ for every upper bound $u$ of $A$ then we call $t$ the least upper
bound or supremum of $A$, denoted $\sup(A)$ or $\sup A$. If there is a point $s \in S$ which
is a lower bound of $A$ so that $s \geq l$ for every lower bound $l$ of $A$ then we call $s$ the
greatest lower bound or infimum of $A$, denoted $\inf(A)$ or $\inf A$. An ordered field is
said to be complete if every non-empty set which is bounded above has a least upper bound.

If there is a value $M \in A$ so that $x \leq M$ for every $x \in A$ then $M = \sup(A)$ and
we say that $M = \max(A)$ (also written $\max A$) is the maximum or largest or last
point of $A$.

If there is a value $m \in A$ so that $x \geq m$ for every $x \in A$ then $m = \inf(A)$ and we
say that $m = \min(A)$ (also written $\min A$) is the minimum or smallest or first point
of $A$.

We may remove parentheses or braces if the notation is less cumbersome. For
instance, if $A$ is a finite set $A = \{x_1, x_2, x_3, \ldots, x_n\}$ then we may write $\min(A) =
\min\{x_1, x_2, \ldots, x_n\}$ or $\min(x_1, x_2, \ldots, x_n)$ instead of $\min(\{x_1, x_2, \ldots, x_n\})$.


We use the notation $\sup_{x \in A} f(x)$ to mean the supremum of $f(A)$. If the variable is
understood, we may instead write $\sup_{A} f(x)$. Similarly, we use $\inf_{x \in A} f(x)$ to mean the
infimum of $f(A)$. If the variable is understood, we may instead write $\inf_{A} f(x)$.




Definition 8 


An interval is a set I such that if a,b ∈ I and a < x < b then x ∈ I. The open
interval (a,b) will denote the set of points x satisfying a < x < b. For this text, we
assume all open intervals listed are non-empty, and when the notation (a,b) is used it
is implied that a < b. For a ≤ b we let the closed interval [a,b] = {x ∈ ℝ | a ≤ x ≤ b}
(and when we write [a,b] it is implied that we are stating a ≤ b). A half open
interval is an open interval plus one of its two end points: [a,b) = {x ∈ ℝ | a ≤ x < b}
or (a,b] = {x ∈ ℝ | a < x ≤ b}. An extended interval is one of: a ray (a set of the
form (a,∞) = {x ∈ ℝ | x > a}, (−∞,a) = {x ∈ ℝ | x < a}, [a,∞) = {x ∈ ℝ | x ≥ a},
(−∞,a] = {x ∈ ℝ | x ≤ a}) or ℝ. In each of these expressions, a and b are referred
to as end points of the interval. A ray containing its supremum or infimum is a closed
ray and a ray which does not contain its supremum or infimum is an open ray.


We leave as an exercise to the reader (in the exercises for this chapter) to prove that a 
set I is an interval if and only if it is one of: a closed interval, an open interval, a half-open 
interval or an extended interval. 


Example 1.4. Let A= (0,1). Find: 
(a) A lower bound for A. 
(b) The least upper bound of A 
(c) The greatest lower bound of A. 


Solution. (a) Any number less than or equal to zero would be a lower bound of A. For 
instance, —10 is a lower bound for A. 

(b) The least upper bound of A is 1. 

(c) The greatest lower bound of A is 0. 
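
The claim in part (b) can be checked directly from Definition 7. The following is a sketch of one possible argument; the auxiliary points m and a are our own choice and are not part of the text, and the argument uses the fact 0 < 1/2 < 1 (see the hint to Exercise 1.6).

```latex
% 1 is an upper bound of A = (0,1), since x < 1 for every x in A.
% Suppose c < 1. We show that c is not an upper bound of A.
\[
  m = \max(c, 0), \qquad a = \frac{m + 1}{2}.
\]
% Since 0 \le m < 1, we get
\[
  0 < a < 1 \quad\text{and}\quad c \le m < a,
\]
% so a \in A and a > c. Hence no c < 1 is an upper bound of A, and \sup(A) = 1.
```

A symmetric argument, with a = min(c, 1)/2 for c > 0, would show that the greatest lower bound in part (c) is 0.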


Definition 9 


For any set A ⊆ S, the coset of A under multiplication by the number c is denoted
by cA = Ac = {cx ∈ S | x ∈ A}. We also write (−1)A as −A. The coset of A under
addition by c is A + c = c + A = {c + x | x ∈ A}.


Notice that not all non-empty bounded sets in ℝ have maxima or minima (the interval (0,1)
of Example 1.4 has neither), but they all have suprema and infima.


Example 1.5. Let A= {0,1,2}. Find: 
(a) 2A 
(b) -A 
(c) A+5 




Solution. (a) 2A = {0, 2, 4}.
(b) −A = {−2, −1, 0}.
(c) A + 5 = {5, 6, 7}.
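
For the finite set A = {0, 1, 2} of this example, maxima and minima exist, so the effect of cosets on bounds can be read off directly. The display below only illustrates the pattern stated in Exercises 1.12 and 1.18 for this one set; it is not a proof.

```latex
\begin{align*}
\sup(A) &= 2, & \inf(A) &= 0, \\
\sup(2A) &= 4 = 2\sup(A), & \inf(-A) &= -2 = -\sup(A), \\
\sup(A + 5) &= 7 = \sup(A) + 5, & \inf(A + 5) &= 5 = \inf(A) + 5.
\end{align*}
```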


The following axiom, referred to as the Completeness or Least Upper Bound axiom (or
the Connectedness axiom) of the real numbers, depending on your audience, states that a
non-empty set which has an upper bound always has a least upper bound. An ordered field
which satisfies this axiom is called a complete ordered field (there is only one complete
ordered field up to isomorphism or homeomorphism, but it is not necessary for us to prove
that in this text). Assuming the completeness axiom is equivalent to assuming as an axiom
that all decimals represent real numbers in the usual ordering and that the natural numbers
are not bounded above.

When presenting calculus explanations to a calculus class, it may be more helpful to
simply assume the aforementioned property (that all decimals represent real numbers
in the usual ordering and that the natural numbers are not bounded above) and use it to
describe why any non-empty set which is bounded above has a least upper bound. In most
calculus classes there isn't enough time to formally develop every theorem in calculus from
fundamentals, so you don't begin by teaching ordered fields; you just pretend that the
basics of the real number structure were handled in the students' algebra classes (which
they were not, but it is convenient to proceed as if they were). Under that assumption,
for each natural number k one can let n_0.n_1n_2...n_k be the largest decimal terminating
at the kth place past the decimal point that does not exceed all elements of a non-empty
set S which is bounded above, and then argue that n_0.n_1n_2n_3... is the least upper bound
for S.
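
To make the decimal construction concrete, here is a worked instance for the set S = {x ∈ ℝ | x² < 2}, which is bounded above by 2. The set S is our own choice of example and does not appear in the text. For each k, the decimal listed on the right is the largest one terminating at the kth place whose square is less than 2, so it does not exceed every element of S, while the next decimal at that place does.

```latex
\begin{align*}
k = 0:&\quad 1^2 = 1 < 2 < 4 = 2^2, & n_0 &= 1 \\
k = 1:&\quad 1.4^2 = 1.96 < 2 < 2.25 = 1.5^2, & n_0.n_1 &= 1.4 \\
k = 2:&\quad 1.41^2 = 1.9881 < 2 < 2.0164 = 1.42^2, & n_0.n_1n_2 &= 1.41 \\
k = 3:&\quad 1.414^2 = 1.999396 < 2 < 2.002225 = 1.415^2, & n_0.n_1n_2n_3 &= 1.414
\end{align*}
```

Continuing in this way produces the digits of the decimal 1.414..., which is the candidate for the least upper bound of S.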

For our development, however, we will assume the following as the axiom rather than 
the thing to be proven. Discussing why these two approaches are equivalent is addressed 
in the Supplementary Materials chapter (in the Decimals section) at the end of the single 
variable portion of the text for those who are interested. 


Definition 10 


We say an ordered field F is complete if every non-empty subset of F which is
bounded above has a least upper bound.


We assume that there is a complete ordered field which we refer to as R as a final axiom. 
Completeness Axiom: There is a complete ordered field R. 


All sets defined hereafter are assumed to be contained in ℝ unless otherwise specified
(when we reach the multivariable portion of the text we will assume sets to be contained in
ℝ^n, but for now we assume they are sets of real numbers).




From this point on we may assume the standard properties that we have proven about an
ordered field in the preceding theorems without referencing them, as long as their application
seems clear.


Theorem 1.14. If A ⊆ ℝ and A has a lower bound, then A has a greatest lower bound.


Proof. Let s be a lower bound for A. Then −s ≥ −x for every x ∈ A, which means that −A
is bounded above and has a least upper bound u. This means u ≥ −x and hence −u ≤ x
for all x ∈ A, so −u is a lower bound for A. If −u < l then u > −l which means that −l is
not an upper bound for −A, so there is some −x ∈ −A so that −l < −x and l > x, where
x ∈ A. Thus, l is not a lower bound for A, which means that −u is the greatest lower bound
of A.


Theorem 1.15. Approximation Property. Let A ⊆ ℝ. If A is bounded above and c <
sup(A) then there is some a ∈ A so that c < a ≤ sup(A). If A is bounded below and
c > inf(A) then there is some a ∈ A so that inf(A) ≤ a < c.


Proof. First, assume A is bounded above. Since sup(A) is the least upper bound for A
and c < sup(A) we know that c is not an upper bound for A, which means that there
is some a ∈ A so that c < a, and since sup(A) is an upper bound for A it follows that
c < a ≤ sup(A).

Next, assume A is bounded below. As before, since inf(A) is the greatest lower bound
for A and c > inf(A) we know that c is not a lower bound for A, which means that there
is some a ∈ A so that c > a, and since inf(A) is a lower bound for A it follows that
inf(A) ≤ a < c.
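
A small numerical illustration of the Approximation Property, with values chosen by us: take A = (0,1), so that sup(A) = 1 and inf(A) = 0 as in Example 1.4.

```latex
\begin{align*}
c = 0.9 < \sup(A):&\quad a = 0.95 \in A \ \text{ satisfies } \ c < a \le \sup(A), \\
c = 0.1 > \inf(A):&\quad a = 0.05 \in A \ \text{ satisfies } \ \inf(A) \le a < c.
\end{align*}
```

Note that a cannot be taken equal to sup(A) here, since 1 is not an element of A; the theorem only promises some element of A strictly above c.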


Definition 11 


The absolute value of a is defined by setting |a| = a if a ≥ 0 and |a| = −a if a < 0.


We also define the distance from point p to point q to be |p − q|.


Theorem 1.16. For any a ∈ ℝ, −|a| ≤ a ≤ |a|.


The proof of this theorem is Exercise 1.9 below (and is proven in the solutions below).
At various points in this text a result may be cited as the exercise where it is proven, or
it may be stated as a theorem and then later proven in an exercise, and we are not terribly
consistent about which convention is used. Essentially, when seeing the statement of a
result is important for understanding a theorem and we think the reader might have trouble
filling in the missing details, or when a result is going to be used many times, we may
state it as a theorem even if we defer the proof to an exercise. In cases where something
has already been proven as an exercise before the first time it is used (or the result seems
likely to be obvious), we are more likely to simply cite the exercise itself rather than
restate it as a theorem.


Theorem 1.17. Let ε > 0 and c ∈ ℝ. Then, for every x ∈ ℝ,
(a) |x| < ε if and only if −ε < x < ε.
(b) |x − c| < ε if and only if c − ε < x < c + ε.
(c) |x − c| ≤ ε if and only if c − ε ≤ x ≤ c + ε.


Proof. (a) First, note that −|x| ≤ x ≤ |x| by Theorem 1.16. If |x| < ε then −ε < −|x|,
so −ε < −|x| ≤ x ≤ |x| < ε. On the other hand, if −ε < x < ε then if x ≥ 0 then
−ε < 0 ≤ |x| = x < ε and if x < 0 then |x| = −x < ε, so −ε < x < 0 < |x| < ε by Theorem
1.11.

(b) By part (a) we know that |x − c| < ε if and only if −ε < x − c < ε, which is true if
and only if c − ε < x < c + ε.

(c) By part (b) we need only check the case where |x − c| = ε, which happens if and
only if x − c = ε or x − c = −ε by definition of absolute value. Thus, |x − c| ≤ ε if and only
if |x − c| < ε or |x − c| = ε, which is true if and only if c − ε < x < c + ε or x = c − ε or
x = c + ε, which is true if and only if c − ε ≤ x ≤ c + ε.


The preceding theorem tells us that we can think of the absolute value of a difference
as being the same as the idea of distance, essentially. The statement "|x − c| < ε" means
the same thing as "the distance from x to c is less than ε."


Example 1.6. Write {x ∈ ℝ | |x − 2| < 4} as an interval.


Solution: (2 − 4, 2 + 4) = (−2, 6).
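
The solution is an application of Theorem 1.17(b) with c = 2 and ε = 4. Written out step by step, the reasoning can be sketched as follows:

```latex
\begin{align*}
|x - 2| < 4
  &\iff -4 < x - 2 < 4 && \text{(Theorem 1.17(a) applied to } x - 2\text{)} \\
  &\iff 2 - 4 < x < 2 + 4 && \text{(adding 2 throughout, Theorem 1.9)} \\
  &\iff x \in (-2, 6).
\end{align*}
```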


Theorem 1.18. The Triangle Inequality. For any a,b ∈ ℝ:
(i) |a| + |b| ≥ |a + b|
(ii) |a| − |b| ≤ |a − b|


Proof. (i) We see from Theorem 1.16 that −|a| ≤ a ≤ |a| and −|b| ≤ b ≤ |b|, so −(|a| + |b|) ≤
a + b ≤ (|a| + |b|) by Exercise 1.4. Thus, |a + b| ≤ |a| + |b| by Theorem 1.17.
(ii) By (i), |a| = |b + (a − b)| ≤ |b| + |a − b|, so |a| − |b| ≤ |a − b|.
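
Two numerical instances (our own choice of values, not from the text) show that the inequality in (i) can be an equality or a strict inequality, depending on the signs of a and b. The last line checks part (ii) for the same values.

```latex
\begin{align*}
a = 3,\ b = 5:&\quad |a + b| = 8 = |a| + |b|, \\
a = 3,\ b = -5:&\quad |a + b| = 2 < 8 = |a| + |b|, \\
a = 3,\ b = -5:&\quad |a| - |b| = -2 \le 8 = |a - b|.
\end{align*}
```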


The triangle inequality helps us to bound the distance between points if distances 
between intermediate points have known bounds. For instance, the following would be 
shown with the Triangle Inequality: 


Example 1.7. Let a > 0 and let |a − b| < a/2. Then prove a/2 < b < 3a/2 using the Triangle
Inequality.


Solution: By the Triangle Inequality |b − 0| ≤ |a − 0| + |a − b| < a + a/2 = 3a/2, so
b ≤ |b| < 3a/2. By the Triangle Inequality (second part), |a| − |b| ≤ |a − b|, which means
that a − |b| < a/2 so a/2 < |b| = b. (Here |b| = b because b > 0: if b ≤ 0 then
|a − b| = a − b ≥ a, contradicting |a − b| < a/2.)


Theorem 1.19. A set A is bounded if and only if there is some M > 0 so that −M ≤ x ≤ M
for every x ∈ A, which is true if and only if |x| ≤ M for all x ∈ A.


The proof is left as an exercise. We will use this theorem's result as an equivalent
form of a set being "bounded" throughout the remainder of this text (without necessarily
referencing this theorem).


Definition 12 


We say a function f : D → ℝ is bounded if ran(f) is bounded.


Example 1.8. Let f(x) = x² and let g(x) = 4 − x on [0,3]. Show that f(x) + g(x) ≤ 13.


Solution: First, if x ∈ [0,3] then x² ≤ 3² by Exercise 1.5, which means that sup_{[0,3]} f(x) = 9.
Also, if 0 ≤ x ≤ y then −x ≥ −y by Theorem 1.11 so 4 − x ≥ 4 − y by Theorem 1.9. Thus
g(x) ≤ g(0) = 4 for every x ∈ [0,3], so sup_{[0,3]} g(x) = 4. By Exercise 1.19, we know
that sup_{[0,3]} (f(x) + g(x)) ≤ 9 + 4 = 13.
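
The bound 13 is an upper bound for the sum but not the least one. As a sketch, taking g(x) = 4 − x, the sum can be bounded more tightly on [0,3]; the case split below is our own and is not part of the text.

```latex
\begin{align*}
f(x) + g(x) &= x^2 - x + 4 = x(x - 1) + 4, \\
x(x - 1) &\le 3 \cdot 2 = 6 && \text{if } 1 \le x \le 3 \text{ (compare Exercise 1.5)}, \\
x(x - 1) &\le 0 && \text{if } 0 \le x < 1, \\
f(x) + g(x) &\le 6 + 4 = 10 = f(3) + g(3) < 13 && \text{for all } x \in [0, 3].
\end{align*}
```

So the inequality in Exercise 1.19 can be strict: here the two suprema are attained at different points (x = 3 for f and x = 0 for g).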




Exercises: 


Exercise 1.1. (DeMorgan's Laws) Let {A_α}_{α∈J} be a collection of subsets of a set X, where
J is an arbitrary indexing set. Then

(a) ⋂_{α∈J} (X \ A_α) = X \ ⋃_{α∈J} A_α and

(b) ⋃_{α∈J} (X \ A_α) = X \ ⋂_{α∈J} A_α.


In the proofs of the next two exercises, cite each axiom and theorem used in your proof. 


Exercise 1.2. Let a,b,c,d ∈ S, where S is a field, and let a,b ≠ 0. Then (c/a)(d/b) = (cd)/(ab).


Exercise 1.3. If a,b are non-zero elements of a field and c,d are elements of the field then
c/b + d/a = (ca + bd)/(ab).


For the proofs of the remaining exercises in this section you are no longer required to 
cite the axioms of a field when they are used, as long as their application is clear in each 
step, but you should cite uses of the order and completeness axioms when these are used. 


Exercise 1.4. Let a < b and let c < d. Then a + c < b + d.
Exercise 1.5. Let 0 ≤ a < b and let 0 ≤ c < d. Then ac < bd.
Exercise 1.6. An ordered field has no smallest positive element.
Exercise 1.7. Let F be an ordered field. If a ∈ F then a² ≥ 0.


Exercise 1.8. Let F be an ordered field and let 0 < a < b, for some a,b ∈ F. Then
0 < 1/b < 1/a.


Exercise 1.9. Prove that, for any a ∈ ℝ, −|a| ≤ a ≤ |a|.


Exercise 1.10. Let F be an ordered field and let a,b ∈ F. Then |ab| = |a||b|.




Exercise 1.11. Let F be an ordered field and let a ∈ F. If |a| < ε for every ε > 0 in F,
then a = 0.


Exercise 1.12. If A ⊆ ℝ is bounded above then for any c > 0, the set cA has a least
upper bound c sup(A), and the set −cA has a greatest lower bound −c sup(A).


Exercise 1.13. Prove that if A ⊆ ℝ is bounded below then for any c > 0, the set cA
has a greatest lower bound c inf(A), and the set −cA has a least upper bound −c inf(A).


Exercise 1.14. Prove that a set A is bounded if and only if there is some M > 0 so that
−M ≤ x ≤ M for every x ∈ A, which is true if and only if |x| ≤ M for all x ∈ A.


Exercise 1.15. If x,y are real numbers so that for every ε > 0 it is true that |x − y| < ε
then x = y.


Exercise 1.16. Let A,B ⊆ ℝ be non-empty sets.

(a) If B is bounded above and, for each x ∈ A, there is a point y ∈ B so that y ≥ x,
then sup(A) ≤ sup(B).

(b) If B is bounded below and, for each x ∈ A, there is a y ∈ B so that y ≤ x then
inf(A) ≥ inf(B).


Exercise 1.17. Let A and B be non-empty sets with A ⊆ B.
(a) If B is bounded above then sup(A) ≤ sup(B).
(b) If B is bounded below then inf(B) ≤ inf(A).


Exercise 1.18. Let A ⊆ ℝ and k ∈ ℝ. If A is bounded above then sup(A + k) = sup(A) + k.
If A is bounded below then inf(A + k) = inf(A) + k.


Exercise 1.19. Let f,g : [c,d] → ℝ be bounded functions. Then
sup_{x∈[c,d]} f(x) + sup_{x∈[c,d]} g(x) ≥ sup_{x∈[c,d]} (f(x) + g(x)). Likewise,
inf_{x∈[c,d]} f(x) + inf_{x∈[c,d]} g(x) ≤ inf_{x∈[c,d]} (f(x) + g(x)).


Exercise 1.20. Let f,g : E → ℝ be bounded. Then f(x)g(x) is bounded.


Exercise 1.21. Let I ⊆ ℝ. Then I is an interval if and only if I is one of the following:
an open interval, a closed interval, a half-open interval or an extended interval.


Hints: 


Hint to Exercise 1.1. (DeMorgan's Laws) Let {A_α}_{α∈J} be a collection of subsets of a set
X, where J is an arbitrary indexing set. Then

(a) ⋂_{α∈J} (X \ A_α) = X \ ⋃_{α∈J} A_α and

(b) ⋃_{α∈J} (X \ A_α) = X \ ⋂_{α∈J} A_α.


Write down the definition of what it means for x to be in the left and right sides of the 
equations, respectively. 

In each case, let x be an element of the set in the left side of the equation. Explain why 
that means x is an element of the set on the right side of the equation. Then let x be an 
arbitrary point in the set on the right side of the equation and explain why that means x 
is in the set on the left side of the equation. 


Hint to Exercise 1.2. Let a,b,c,d ∈ S, where S is a field, and let a,b ≠ 0. Then
(c/a)(d/b) = (cd)/(ab).


Remember that the definition of c/b is c times the multiplicative inverse of b. Also recall
that it has already been shown that (1/a)(1/b) = 1/(ab). Write down what (c/a)(d/b) means
and look through the field axioms.


Hint to Exercise 1.3. If a,b are non-zero elements of a field and c,d are elements of the
field then c/b + d/a = (ca + bd)/(ab).


Remember that we have already shown that 1/b + 1/a = (a + b)/(ab). Try to use field axioms
and the definition of (ca + bd)/(ab).


Hint to Exercise 1.4. Let a < b and let c < d. Then a + c < b + d.


First, show that a + c < b + c and then show that b + c < b + d.


Hint to Exercise 1.5. Let 0 ≤ a < b and let 0 ≤ c < d. Then ac < bd.


Use Theorem 1.8.


Hint to Exercise 1.6. An ordered field has no smallest positive element.


You might want to start by proving that 0 < 1/2 < 1.




Hint to Exercise 1.7. Let F be an ordered field. If a ∈ F then a² ≥ 0.


Try breaking the problem down into cases. Trichotomy says a > 0 or a = 0 or a < 0. 
Prove the result in each case separately. 


Hint to Exercise 1.8. Let F be an ordered field and let 0 < a < b, for some a,b ∈ F.
Then 0 < 1/b < 1/a.

See if you can find a positive number to multiply the terms in the inequality 0 < a < b
by to arrive at 0 < 1/b < 1/a.


Hint to Exercise 1.9. Prove that, for any a ∈ ℝ, −|a| ≤ a ≤ |a|.


Try looking at the problem in cases. By Trichotomy, a > 0 or a < 0 or a = 0.


Hint to Exercise 1.10. Let F be an ordered field and let a,b ∈ F. Then |ab| = |a||b|.


Try looking at the problem in cases. By Trichotomy, a > 0 or a < 0 or a = 0, and b > 0
or b < 0 or b = 0.


Hint to Exercise 1.11. Let F be an ordered field and let a ∈ F. If |a| < ε for every ε > 0
in F, then a = 0.


Suppose a ≠ 0. What can be said about |a|? Does this contradict the hypothesis?


Hint to Exercise 1.12. If A ⊆ ℝ is bounded above then for any c > 0, the set cA
has a least upper bound c sup(A), and the set −cA has a greatest lower bound −c sup(A).


First, take an element x ∈ A and explain why cx ≤ c sup(A). Then, suppose u is an
upper bound for cA and explain why u/c is an upper bound for A. Use this to explain why
c sup(A) is the least upper bound of cA. Then do something similar for the other half of
the theorem.


Hint to Exercise 1.13. Prove that if A ⊆ ℝ is bounded below then for any c > 0,
the set cA = {cx ∈ ℝ | x ∈ A} has a greatest lower bound c inf(A), and the set −cA has a
least upper bound −c inf(A).


The argument is very similar to that described for the proof of the preceding exercise. 


Hint to Exercise 1.14. Prove that a set A is bounded if and only if there is some M > 0
so that −M ≤ x ≤ M for every x ∈ A, which is true if and only if |x| ≤ M for all x ∈ A.


Write down the definition of what it means for a set to be bounded. Explain why this 
means the absolute values of the elements of the set are bounded. Then explain why the 
absolute values of elements of a set being bounded would imply that the set is bounded. 
You might also consider using Theorem 1.16. 




Hint to Exercise 1.15. If x,y are real numbers so that for every ε > 0 it is true that
|x − y| < ε then x = y.


You could use Theorem 1.17, or methods similar to the proof of that theorem.


Hint to Exercise 1.16. Let A,B ⊆ ℝ be non-empty sets.

(a) If B is bounded above and, for each x ∈ A, there is a point y ∈ B so that y ≥ x,
then sup(A) ≤ sup(B).

(b) If B is bounded below and, for each x ∈ A, there is a y ∈ B so that y ≤ x then
inf(A) ≥ inf(B).


Start with an upper bound for B and explain why that is also an upper bound for A. 
For the second part, start with a lower bound for B and explain why that is also a lower 
bound for A. 


Hint to Exercise 1.17. Let A and B be non-empty sets with A ⊆ B.
(a) If B is bounded above then sup(A) ≤ sup(B).
(b) If B is bounded below then inf(B) ≤ inf(A).


Show that an upper bound for B is an upper bound for A and that a lower bound for 
B is a lower bound for A, and then explain why this implies the conclusion (consider the 
definitions of least upper bound and greatest lower bound). 


Hint to Exercise 1.18. Let A ⊆ ℝ and k ∈ ℝ and define A + k = {x + k ∈ ℝ | x ∈ A}. If
A is bounded above then sup(A + k) = sup(A) + k. If A is bounded below then inf(A + k) =
inf(A) + k.


Try to parallel the strategy of Exercise 1.12 somewhat, using addition and subtraction 
instead of multiplication and division. 


Hint to Exercise 1.19. Let f,g : [c,d] → ℝ be bounded functions. Then
sup_{x∈[c,d]} f(x) + sup_{x∈[c,d]} g(x) ≥ sup_{x∈[c,d]} (f(x) + g(x)). Likewise,
inf_{x∈[c,d]} f(x) + inf_{x∈[c,d]} g(x) ≤ inf_{x∈[c,d]} (f(x) + g(x)).


Try to explain why f(z) + g(z) ≥ inf_{x∈[c,d]} f(x) + inf_{x∈[c,d]} g(x) for each z ∈ [c,d].


Hint to Exercise 1.20. Let f,g : E → ℝ be bounded. Then f(x)g(x) is bounded.


You could use Theorem 1.19. 


Hint to Exercise 1.21. Let I ⊆ ℝ. Then I is an interval if and only if I is one of the
following: an open interval, a closed interval, a half-open interval or an extended interval.




Start out with the definitions of each type of interval listed and explain why they satisfy 
the definition of being an interval, which is that an interval is a set containing every point 
which is between any two elements of that set. Then, use the definition of an interval and 
break down the possible cases of the interval being bounded above, bounded below, both 
above and below and neither bounded above nor below, and explain why these cases each 
imply that an interval is one of the listed types of interval. 




Solutions: 


Solution to Exercise 1.1. (DeMorgan's Laws) Let {A_α}_{α∈J} be a collection of subsets of
a set X, where J is an arbitrary indexing set. Then

(a) ⋂_{α∈J} (X \ A_α) = X \ ⋃_{α∈J} A_α and

(b) ⋃_{α∈J} (X \ A_α) = X \ ⋂_{α∈J} A_α.


Proof. (a) Let x ∈ ⋂_{α∈J} (X \ A_α). Then for every α ∈ J it follows that x ∈ X and x ∉ A_α,
which means that x ∈ X \ ⋃_{α∈J} A_α.
Let x ∈ X \ ⋃_{α∈J} A_α. This means that x ∈ X and x is not in any A_α, which means that
x ∈ X \ A_α for every α ∈ J and thus x ∈ ⋂_{α∈J} (X \ A_α).
Hence, ⋂_{α∈J} (X \ A_α) = X \ ⋃_{α∈J} A_α.
(b) Let x ∈ ⋃_{α∈J} (X \ A_α). Then for some β ∈ J it follows that x ∈ X and x ∉ A_β. This
means that x ∉ ⋂_{α∈J} A_α and thus x ∈ X \ ⋂_{α∈J} A_α.
Let x ∈ X \ ⋂_{α∈J} A_α. Then x ∈ X and since x ∉ ⋂_{α∈J} A_α, there is some β ∈ J so that
x ∉ A_β. Thus, x ∈ X \ A_β which means that x ∈ ⋃_{α∈J} (X \ A_α).
Thus, ⋃_{α∈J} (X \ A_α) = X \ ⋂_{α∈J} A_α.


Solution to Exercise 1.2. Let a,b,c,d ∈ S, where S is a field, and let a,b ≠ 0. Then
(c/a)(d/b) = (cd)/(ab).


Proof. Since we have already shown that (1/a)(1/b) = 1/(ab), we simply observe that
(c/a)(d/b) = (c(1/a))(d(1/b)) by definition, which is (cd)((1/a)(1/b)) = (cd)(1/(ab)) by the
Associative and Commutative properties, which is the same as (cd)/(ab) by definition.


Solution to Exercise 1.3. If a,b are non-zero elements of a field and c,d are elements of
the field then c/b + d/a = (ca + bd)/(ab).


Proof. We know that ca + bd = ca + bd, so multiplying both sides by (1/a)(1/b) gives
(1/a)(1/b)(ca + bd) = (1/a)(1/b)(ca + bd), and since we have shown that (1/a)(1/b) = 1/(ab)
we can rewrite this as (1/a)(1/b)(ca) + (1/a)(1/b)(bd) = (ca + bd)/(ab), using Distributivity
on the left side. By Commutativity and Associativity we can simplify the left side of
this equation to (c(1/b))(a(1/a)) + (d(1/a))(b(1/b)). By Inverses and Identity this further
simplifies to c/b + d/a. Thus, c/b + d/a = (ca + bd)/(ab).


Solution to Exercise 1.4. Let a < b and let c < d. Then a + c < b + d.


Proof. By Theorem 1.9 we know a + c < b + c. By the same theorem, b + c < b + d. Thus,
by Theorem 1.12, we know that a + c < b + d.


Solution to Exercise 1.5. Let 0 ≤ a < b and let 0 ≤ c < d. Then ac < bd.


Proof. If a = 0 or c = 0 then ac = 0 and since b,d > 0 we know that bd > 0 by Closure, so
ac < bd. Otherwise a,c > 0, in which case we know ca < cb and bc < bd by Theorem 1.8,
which means that ac < bd by Theorem 1.12.


Solution to Exercise 1.6. An ordered field has no smallest positive element.


Proof. First, note that since we have shown 1 > 0, it follows that 1 + 1 > 1 so 2 > 1 and 2 is
positive. Hence, it follows from Theorem 1.13 that 1/2 is positive, which means that
2(1/2) > 1(1/2) so 1/2 < 1. Let a > 0. Then since 0 < 1/2 < 1 we know that
0(a) < a(1/2) < a(1), so a/2 is a positive number which is less than a. Thus, there is no
smallest positive number.


Solution to Exercise 1.7. Let F be an ordered field. If a ∈ F then a² ≥ 0.


Proof. If a > 0 then (a)(a) > 0 by the Closure axiom for the positive numbers. If a = 0
then (a)(a) = 0. If a < 0 then −a > 0 so (−a)(−a) > 0 by Closure again, so (−a)(−a) =
(−1)(−1)(a)(a) = (1)(a²) = a² > 0. By Trichotomy, these are the only possible cases, so
a² ≥ 0 for every a ∈ F.


Solution to Exercise 1.8. Let F be an ordered field and let 0 < a < b, for some a,b ∈ F.
Then 0 < 1/b < 1/a.


Proof. Since a,b are positive, ab > 0 by the Closure axiom for the positive numbers. Thus,
by Theorem 1.13, we know that 1/(ab) > 0. Also, we have shown that 1/(ab) = (1/a)(1/b).
Thus, since a < b it follows that a(1/a)(1/b) < b(1/a)(1/b), which means 1/b < 1/a. Finally,
since multiplying two positive numbers always results in a positive number by Closure, we
know that 0 < 1/b (as the product of the positive numbers a and 1/(ab)), so
0 < 1/b < 1/a.




Solution to Exercise 1.9. Prove that for any a ∈ ℝ, −|a| ≤ a ≤ |a|.


Proof. If a ≥ 0 then |a| = a ≥ 0 ≥ −a = −|a|. If a < 0 then |a| = −a > 0 > a = −|a|.
Thus, in each possible case the theorem is true.


Solution to Exercise 1.10. Let F be an ordered field and let a,b ∈ F. Then |ab| = |a||b|.


Proof. If a ≥ 0 and b ≥ 0 then |a||b| = ab = |ab|. If a ≥ 0 and b < 0 then |a||b| = a(−b) =
−ab = |ab|. If a < 0 and b < 0 then |ab| = ab = (−a)(−b) = |a||b|. All possible cases are
addressed in these three since the case where a < 0 and b ≥ 0 is handled in the second case
by renaming a and b, so this completes the proof.


Solution to Exercise 1.11. Let F be an ordered field and let a ∈ F. If |a| < ε for every
ε > 0 in F, then a = 0.


Proof. If a > 0 then |a| = a > 0, which is impossible since |a| < a by assumption (taking
ε = a). If a < 0 then |a| = −a > 0 which is, again, impossible, since |a| < −a by assumption
(taking ε = −a). Hence, a = 0.


Solution to Exercise 1.12. If A ⊆ ℝ is bounded above then for any c > 0, the set
cA has a least upper bound c sup(A), and the set −cA has a greatest lower bound −c sup(A).


Proof. For every x ∈ A we know that x ≤ sup(A), so cx ≤ c sup(A), which is an upper
bound for cA. If u is an upper bound for cA then for every x ∈ cA it must follow that
x/c ≤ u/c, so u/c is an upper bound for A. This means that u/c ≥ sup(A) and therefore
u ≥ c sup(A), so c sup(A) is the least upper bound of cA.

For every x ∈ A we know that x ≤ sup(A), so −cx ≥ −c sup(A), which is a lower bound
for −cA. If l is a lower bound for −cA then for every x ∈ A we know that −cx ∈ −cA
so −cx ≥ l, which means that x ≤ l/(−c) so l/(−c) is an upper bound for A. This means that
l/(−c) ≥ sup(A) and therefore l ≤ −c sup(A), so −c sup(A) is the greatest lower bound of
−cA.


Solution to Exercise 1.13. Prove that if A ⊆ ℝ is bounded below then for any
c > 0, the set cA has a greatest lower bound c inf(A), and the set −cA has a least upper
bound −c inf(A).


Proof. For every x ∈ A we know that x ≥ inf(A), so cx ≥ c inf(A), which is a lower bound
for cA. If l is a lower bound for cA then for every x ∈ cA it must follow that x/c ≥ l/c, so
l/c is a lower bound for A. This means that l/c ≤ inf(A) and therefore l ≤ c inf(A), so
c inf(A) is the greatest lower bound of cA.

For every x ∈ A we know that x ≥ inf(A), so −cx ≤ −c inf(A), which is an upper bound
for −cA. If u is an upper bound for −cA then for every x ∈ A we know that −cx ∈ −cA
so −cx ≤ u, so x ≥ u/(−c), so u/(−c) is a lower bound for A. This means that
u/(−c) ≤ inf(A) and therefore u ≥ −c inf(A), so −c inf(A) is the least upper bound of −cA.


Solution to Exercise 1.14. Prove that a set A is bounded if and only if there is some
M > 0 so that −M ≤ x ≤ M for every x ∈ A, which is true if and only if |x| ≤ M for all
x ∈ A.


Proof. First, let M > 0. Then |x| ≤ M for each x ∈ A if and only if −M ≤ x ≤ M for each
x ∈ A by Theorem 1.17.

Assume that A is bounded. Then there are bounds a,b for A so that a ≤ x ≤ b for
each x ∈ A. Let M = max(|a|,|b|,1), so that M > 0. By Theorem 1.16, for each x ∈ A it is
true that −M ≤ −|a| ≤ a ≤ x ≤ b ≤ |b| ≤ M, which means that |x| ≤ M.

If M is a positive number so that −M ≤ x ≤ M for all x ∈ A then −M is a lower bound
for A and M is an upper bound for A, so A is bounded.


Solution to Exercise 1.15. If x,y are real numbers so that for every ε > 0 it is true that
|x − y| < ε then x = y.


Proof. By Exercise 1.11, |x − y| = 0. If x − y > 0 then |x − y| = x − y > 0 and if
y − x > 0 then |x − y| = y − x > 0. Thus, it is false that x − y is positive or negative, so
by Trichotomy we conclude that x − y = 0, so x = y.


Solution to Exercise 1.16. Let A,B ⊆ ℝ be non-empty sets.

(a) If B is bounded above and, for each x ∈ A, there is a point y ∈ B so that y ≥ x,
then sup(A) ≤ sup(B).

(b) If B is bounded below and, for each x ∈ A, there is a y ∈ B so that y ≤ x then
inf(A) ≥ inf(B).


Proof. (a) Let u = sup(B). Then for each a ∈ A we know that there is some b ∈ B so
that a ≤ b ≤ u which means that a ≤ u, so u is an upper bound for A. Hence, sup(A) ≤ u.

(b) Let l = inf(B). Then for each a ∈ A there is some b ∈ B so that l ≤ b ≤ a, which
means that l is a lower bound for A. Hence, l ≤ inf(A).




Solution to Exercise 1.17. Let A and B be non-empty sets with A ⊆ B.
(a) If B is bounded above then sup(A) ≤ sup(B).
(b) If B is bounded below then inf(B) ≤ inf(A).


Proof. (a) For each x ∈ A we know that x ∈ B, which means that x ≤ sup(B). Thus,
sup(B) is an upper bound for A, so sup(A) ≤ sup(B).

(b) For each x ∈ A we know that x ∈ B, which means that x ≥ inf(B). Thus, inf(B) is
a lower bound for A, so inf(A) ≥ inf(B).


Solution to Exercise 1.18. Let A ⊆ ℝ and let A + k = {x + k ∈ ℝ | x ∈ A}. If A is bounded
above then sup(A + k) = sup(A) + k. If A is bounded below then inf(A + k) = inf(A) + k.


Proof. Let x ∈ A. Then x ≤ sup(A) so x + k ≤ sup(A) + k, which is an upper bound of
A + k. Let u be an upper bound of A + k. Then for any x ∈ A we know that x + k ≤ u
so x ≤ u − k, which is an upper bound for A. Thus, sup(A) ≤ u − k so sup(A) + k ≤ u,
making sup(A) + k the least upper bound of A + k.

Next, if x ∈ A then x ≥ inf(A) so x + k ≥ inf(A) + k, which is a lower bound of A + k.
Let l be a lower bound of A + k. Then for any x ∈ A we know that x + k ≥ l so x ≥ l − k,
which is a lower bound for A. Thus, inf(A) ≥ l − k so inf(A) + k ≥ l, making inf(A) + k
the greatest lower bound of A + k.


Solution to Exercise 1.19. Let f,g : [c,d] → ℝ be bounded functions. Then
sup_{x∈[c,d]} f(x) + sup_{x∈[c,d]} g(x) ≥ sup_{x∈[c,d]} (f(x) + g(x)). Likewise,
inf_{x∈[c,d]} f(x) + inf_{x∈[c,d]} g(x) ≤ inf_{x∈[c,d]} (f(x) + g(x)).


Proof. Let S_f = sup_{x∈[c,d]} f(x) and S_g = sup_{x∈[c,d]} g(x). For every x ∈ [c,d] we know
that f(x) ≤ S_f and g(x) ≤ S_g, which means that f(x) + g(x) ≤ S_f + S_g, which is an upper
bound for {f(x) + g(x) | x ∈ [c,d]}. This means that S_f + S_g is greater than or equal to the
least upper bound of {f(x) + g(x) | x ∈ [c,d]}, which is sup_{x∈[c,d]} (f(x) + g(x)).

Let I_f = inf_{x∈[c,d]} f(x) and I_g = inf_{x∈[c,d]} g(x). For every x ∈ [c,d] we know that
f(x) ≥ I_f and g(x) ≥ I_g, which means that f(x) + g(x) ≥ I_f + I_g, which is a lower bound
for {f(x) + g(x) | x ∈ [c,d]}. This means that I_f + I_g is less than or equal to the greatest
lower bound of {f(x) + g(x) | x ∈ [c,d]}, which is inf_{x∈[c,d]} (f(x) + g(x)).


Solution to Exercise 1.20. Let f,g : E → ℝ be bounded. Then f(x)g(x) is bounded.


Proof. Since f and g are bounded, there are numbers M_f, M_g so that |f(x)| ≤ M_f and
|g(x)| ≤ M_g for all x ∈ E. Hence, |f(x)g(x)| = |f(x)||g(x)| ≤ M_f M_g for all x ∈ E, which
means that fg is bounded on E.




Solution to Exercise 1.21. Let [ CR. Then I is an interval if and only if I is one of the 
following: an open interval, a closed interval, a half-open interval or an extended interval. 


Proof. First, assume that J is an open or closed interval with end points a < b. Ifa =b 
then J consists of at most one point and is an interval vacuously. Otherwise, if c,d € I with 
c<dthena<c<d<b,soifc<a<dthena< 2x < b which means z € I by definition 
of open and closed interval. Next, assume that I is a ray with end point a. If I = (a,0oo) or 
[a,oo) anda <c<dandc< zz < d then again, x € I by definition since a < x. Similarly, 
if [ = (—co,a) or I = (—co, a] then J is an interval since if c << x << d<a then x < aso 
x € I. Finally, if J = R then all real numbers are in J so J is an interval. 

Next, assume that $I$ is an interval. If $I$ is neither bounded above nor below then
for any real $x$ there are points $c, d \in I$ so that $c < x < d$, so $x \in I$, which means that
$I = \mathbb{R}$. If $I$ is bounded below but not above then let $a = \inf(I)$. If $x > a$ then $x$ is not
a lower bound for $I$, which means that for some point $c < x$ it is true that $c \in I$ by the
Approximation Property. Since $I$ is not bounded above there is some $d > x$ so that $d \in I$,
and hence $c < x < d$, so $x \in I$. Thus, $I$ is either $[a, \infty)$ or $(a, \infty)$ depending on whether or
not $a \in I$.

By similar reasoning, if $I$ is bounded above but not below then let $a = \sup(I)$. If $x < a$
then $x$ is not an upper bound for $I$, which means that for some point $d > x$ it is true that
$d \in I$. Since $I$ is not bounded below there is some $c < x$ with $c \in I$, and hence $c < x < d$,
so $x \in I$. Thus, $I$ is either $(-\infty, a)$ or $(-\infty, a]$ depending on whether or not $a \in I$.

If $I$ is bounded above and below, let $a = \inf(I)$ and let $b = \sup(I)$. For any point
$x \in (a,b)$ we can find points $c \in I \cap (a,x)$ and $d \in I \cap (x,b)$ by the Approximation Property,
so $x \in I$, which means that $(a,b) \subseteq I$. Thus, $I = [a,b]$ or $[a,b)$ or $(a,b]$ or $(a,b)$.


Chapter 2 


Induction 


The natural numbers are denoted by N and are the counting numbers {1, 2,3,...} but using 
that as a definition before establishing induction is potentially problematic because it is not 
clear that such listings with three dots at the end are a valid way to create a set without 
induction. Because we are focused on the main results of advanced calculus in this text we 
will assume results about the natural numbers in this section without their proofs. 

The Supplementary Materials chapter contains a development of the natural numbers 
without assuming any additional axioms. If the reader has time, it is recommended that 
the proofs of these properties be studied from the Supplementary Materials chapter. 

Essentially, if this text is used for a math education major course and the goal is to 
finish a fair bit of the integration chapter in one semester then it is probably better to 
proceed as outlined below. If this text is used as a math major text and is the first proof 
course a student encounters with the goal of finishing Chapter 5 with a brief introduction to 
integration then foundations are probably more important and integration will be covered 
in a later course, so it might be wise to prove the theorems in the development in the 
supplementary materials. 


Properties of N (and the theorem in the Supplementary Materials chapter where a 
justification can be found for each): 

1 is the least element of $\mathbb{N}$. (by Theorem 7.1)

$\mathbb{N}$ is well-ordered, meaning that every non-empty subset of $\mathbb{N}$ has a least element. (by
Theorem 7.3)

For every $n \in \mathbb{N}$, $n + 1 \in \mathbb{N}$. (by Theorem 7.1)

If $n \in \mathbb{N}$ there are no natural numbers between $n$ and $n+1$ or between $n$ and $n-1$. (by
Theorem 7.3)

If $n > 1$ and $n \in \mathbb{N}$ then $n - 1 \in \mathbb{N}$. (by Theorem 7.1)


Theorem 2.1. Principle of Mathematical Induction. For each $n \in \mathbb{N}$ let $P(n)$ be a statement
so that the following two statements are true:

(a) $P(1)$ is true

and

(b) If $P(k)$ is true for some $k \in \mathbb{N}$ then $P(k+1)$ is true.


Then it follows that $P(n)$ is true for all $n \in \mathbb{N}$.




Proof. Let $S = \{n \in \mathbb{N} \mid P(n) \text{ is false}\}$. Suppose $S \ne \emptyset$. Then since $\mathbb{N}$ is well-ordered, there
is a first element $m \in S$. We know $P(1)$ is true, and since $1 \ne m$ and 1 is the least natural
number, it must follow that $m > 1$. Thus, we know that $m - 1 \in \mathbb{N}$, and since $m$ is the least
element of $S$ it follows that $m - 1 \notin S$, so $P(m-1)$ is true. But then, by (b) we know that
$P(m - 1 + 1)$ is true, so $P(m)$ is a true statement, contradicting $m \in S$. Hence, it follows
that $S$ is empty, so $P(n)$ is true for all natural numbers $n$.


Note that the assumption that $P(k)$ is true in induction arguments is often referred to
as the "induction hypothesis." The principle of mathematical induction feels like it should
be true: if a first statement is true, and whenever a given statement is true the next
statement is true, then the second statement is true because the first is true, the third is
true because the second is true, the fourth is true because the third is true, and so on, so
the statement is true for all natural numbers. Unfortunately, the "and so on" part of that
discussion is the assumption of mathematical induction. Thus, while that description is not
mathematically rigorous, it can help some people with their intuition.

Some people studying induction for the first time think "doesn't the assumption that the
statement is true for a given $k$, which is made in most induction proofs, assume what we are
trying to prove?" It does not, because we are only making such an assumption for an arbitrary
$k$ and using that to prove the statement would then be true for $k + 1$. This shows that if
the statement is true for a given $k$ then it is true for $k + 1$. Most mathematical theorems
are phrased like this: you prove something is true assuming certain hypotheses. No one is
asserting the hypotheses are true, only that if they are true then a conclusion follows. When
we make the induction hypothesis assumption we are not stating that we think $P(k)$ is true.
We are demonstrating that if it is given that $P(k)$ is true then that would imply that
$P(k + 1)$ is true as well.


Here is an example of a proof using induction. Since we may use the result later, we 
will label it as a theorem rather than an example. 


Theorem 2.2. For all natural numbers $n$, the sum $1 + 2 + \dots + n = \frac{n(n+1)}{2}$.

Proof. Proceeding by induction, when $n = 1$ we know that $1 = \frac{1(1+1)}{2}$ is true. Assuming
the statement is true when $n = k \in \mathbb{N}$ we have that $1 + 2 + \dots + k = \frac{k(k+1)}{2}$, so
$1 + 2 + \dots + k + (k+1) = \frac{k(k+1)}{2} + (k+1) = \frac{k^2 + 3k + 2}{2} = \frac{(k+1)(k+2)}{2}$. This establishes the
result for all natural numbers $n$ by induction.
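As a sanity check on Theorem 2.2 (not a substitute for the induction proof), the following sketch compares the term-by-term sum with the closed form for the first hundred natural numbers. The function names `sum_first` and `closed_form` are our own and are only for illustration.

```python
def sum_first(n: int) -> int:
    """Add 1 + 2 + ... + n one term at a time."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def closed_form(n: int) -> int:
    """The formula n(n+1)/2 from Theorem 2.2 (n(n+1) is always even)."""
    return n * (n + 1) // 2


# A finite check only: induction is what covers every natural number.
for n in range(1, 101):
    assert sum_first(n) == closed_form(n)
```

A finite check like this can catch an algebra slip in a conjectured formula, but only the induction argument establishes the result for all $n$.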


Theorem 2.3. Let $m, n \in \mathbb{N}$.
(a) $m + n \in \mathbb{N}$.
(b) $mn \in \mathbb{N}$.
(c) If $n > m$ then $n = m + k$ for some $k \in \mathbb{N}$. In other words, $n - m \in \mathbb{N}$.


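Proof. Fix $m \in \mathbb{N}$.

(a) Let $P(n)$ be the statement that $m + n \in \mathbb{N}$. We know $m + 1 \in \mathbb{N}$ since $\mathbb{N}$ is inductive.
Assume that $m + k \in \mathbb{N}$ for some $k \in \mathbb{N}$. Then $m + k + 1 \in \mathbb{N}$ because $\mathbb{N}$ is inductive. Hence,
$m + n \in \mathbb{N}$ for all $n \in \mathbb{N}$. Since the choice of $m$ was arbitrary, $m + n \in \mathbb{N}$ for all $m, n \in \mathbb{N}$.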




(b) Let $Q(n)$ be the statement that $mn \in \mathbb{N}$. We know that $m(1) \in \mathbb{N}$. Assume that
$mk \in \mathbb{N}$. Then $m(k+1) = mk + m \in \mathbb{N}$ since we know that $mk$ and $m$ are natural numbers
and we have shown that the sum of natural numbers is a natural number. It follows that
$mn \in \mathbb{N}$ for all $n \in \mathbb{N}$. Since the choice of $m$ was arbitrary, it follows that $mn \in \mathbb{N}$ for all
$m, n \in \mathbb{N}$.

(c) Let $S = \{n \in \mathbb{N} \mid n > m \text{ and } n - m \notin \mathbb{N}\}$. Suppose that $S \ne \emptyset$. Then $S$ has a least
element $t$ since $\mathbb{N}$ is well-ordered. We know that $t$ is not $m + 1$ since $m + 1 - m = 1 \in \mathbb{N}$, so
$t$ is at least $m + 2$. Hence, $t - 1 > m$ and, since $t - 1 \notin S$, we know that $(t - 1) - m \in \mathbb{N}$,
from which it follows that $(t - 1 - m) + 1 = t - m \in \mathbb{N}$, contradicting the assumption that
$t \in S$. The result follows.


Definition 13 


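The integers $\mathbb{Z}$ are the set $\mathbb{N} \cup -\mathbb{N} \cup \{0\} = \{\dots, -3, -2, -1, 0, 1, 2, 3, \dots\}$. The
rational numbers are the set $\mathbb{Q} = \{\frac{p}{q} \in \mathbb{R} \mid p \in \mathbb{Z} \text{ and } q \in \mathbb{N}\}$.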


Theorem 2.4. (a) For any integer $k$, there are no integers between $k$ and $k + 1$ or between
$k$ and $k - 1$.

(b) If $m, n$ are integers then $m + n$ and $mn$ are integers.

(c) Let $k \in \mathbb{Z}$. Then if $j$ is an integer so that $j > k$ then $j = k + m$ for some natural
number $m$.


Proof. (a) The only positive integers are natural numbers and for every negative integer
$k$ we know that $-k \in \mathbb{N}$ by definition. Thus, 1 is the least positive integer and $-1$ is the
greatest negative integer, so there are no integers between $-1$ and 0 or between 0 and 1. If
$k \in \mathbb{N}$ then by Theorem 7.3, it follows that there are no integers between $k$ and $k + 1$ or
between $k$ and $k - 1$. If $-k \in \mathbb{N}$ then again by Theorem 7.3 it follows that there are no
integers between $-k$ and $-(k + 1)$ and no integers between $-k$ and $-(k - 1)$, which means
that there are no integers between $k$ and $k + 1$ or between $k$ and $k - 1$.

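(b) If $m = 0$ then $mn = 0$ and $m + n = n$. If $m, n \in \mathbb{N}$ then the result follows
by Theorem 2.3. If $m, n \in -\mathbb{N}$ then $mn = (-m)(-n) \in \mathbb{N}$ by the same theorem, and
$m + n = -((-m) + (-n))$. Since $-m, -n \in \mathbb{N}$ we know that $-m - n \in \mathbb{N}$, so $m + n \in -\mathbb{N} \subset \mathbb{Z}$.
If $m \in \mathbb{N}$ and $n \in -\mathbb{N}$ then $m(-n) \in \mathbb{N}$, so $mn \in -\mathbb{N}$. Also, $m + n = m - (-n) \in \mathbb{N}$ if
$m > -n$ by the third part of Theorem 2.3. If $m = -n$ then $m + n = 0 \in \mathbb{Z}$. If $m < -n$
then $-n - m \in \mathbb{N}$, which means that $m + n \in -\mathbb{N} \subset \mathbb{Z}$. The remaining cases follow by
exchanging the roles of $m$ and $n$.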

(c) By (b) we know that $j - k \in \mathbb{Z}$, and since $j > k$ we know that $j - k > 0$, which means
that $j - k = m \in \mathbb{N}$ since the only positive integers are natural numbers.


Theorem 2.5. Generalized and Strong Induction. Let $j \in \mathbb{Z}$. For each $n \in \mathbb{Z}$, let $P(n)$ be
a statement so that (a) $P(j)$ is true, and one of the following is true:

(b) whenever $P(k)$ is true for an integer $k \ge j$, it follows that $P(k + 1)$ is also true,

or

(b)' whenever $P(i)$ is true for all integers $i$ such that $j \le i \le k$, it follows that $P(k + 1)$
is also true.


Then it follows that $P(n)$ is true for all integers $n \ge j$.


Proof. First, assume (a) and (b)' are true. Let $Q(n)$ be the statement that the statements
$P(j), P(j+1), \dots, P(j+n-1)$ are true. Then $Q(1)$ is that $P(j)$ is true, which follows from
(a). Assuming that $Q(k)$ is true we know that $P(j), P(j+1), \dots, P(j+k-1)$ are true, so
$P(j + k)$ is also true by (b)', which means that $Q(k + 1)$ is true. Hence, $Q(n)$ is true for all
natural numbers $n$ by induction, and so $P(i)$ is true for all integers $i \ge j$ by Theorem 2.4
part (c).

Assume (a) and (b). Since (b) implies (b)' the result follows from the preceding argument.


"Generalized" induction is where we start the induction at an integer other than one
(we show (a) and (b)). "Strong" induction is where we assume the result has been shown
for all $j \le n \le k$ instead of just for $n = k$ in order to prove the result is true when $n = k + 1$
(we show (a) and (b)'). Here is an example using strong induction. The proof could be done
just using generalized induction in this case, but we will use strong induction to illustrate
how it can be used.


Example 2.1. Let $n \in \mathbb{N}$ and let $n \ge 8$. Then prove that $n = 3i + 5j$ for some non-negative
integers $i$ and $j$.


Solution. If $n = 8$ we see that $3(1) + 5(1) = 8$, so the statement is true. Also, $3(3) + 5(0) = 9$,
and $3(0) + 5(2) = 10$. Assume the result is true for all natural numbers $m$ so that $8 \le m \le k$,
where $k \ge 10$. Then we know that there are non-negative integers $i, j$ so that $k - 2 = 3i + 5j$
since $k - 2 \ge 8$, and thus $k + 1 = 3(i + 1) + 5j$. The result follows by strong induction.
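The strong induction argument in Example 2.1 can be read as a recursive procedure: three base cases, and every larger $n$ is reduced to $n - 3$. The sketch below is one way to write that down; the function name `three_five` is hypothetical and chosen only for this illustration.

```python
from typing import Tuple


def three_five(n: int) -> Tuple[int, int]:
    """Return non-negative integers (i, j) with n == 3*i + 5*j, for n >= 8."""
    if n < 8:
        raise ValueError("Example 2.1 only covers n >= 8")
    # Base cases checked directly in the solution.
    base = {8: (1, 1), 9: (3, 0), 10: (0, 2)}
    if n in base:
        return base[n]
    # Inductive step: the result for n - 3 (which is at least 8) gives the
    # result for n by adding one more 3.
    i, j = three_five(n - 3)
    return i + 1, j


for n in range(8, 200):
    i, j = three_five(n)
    assert i >= 0 and j >= 0 and 3 * i + 5 * j == n
```

The recursion reaches back three steps rather than one, which is why the induction hypothesis is assumed for all $8 \le m \le k$ and why three base cases are needed.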


Definition 14 


For any $x \in \mathbb{R}$ we define $x^1 = x$ and for each $n \in \mathbb{N}$, if $n > 1$ and $x^{n-1}$ is known
then the $n$th power $x^n$ of $x$ is defined to be $x(x^{n-1})$. If $x \ne 0$ then we define
$x^0 = 1$ and $x^{-n} = \frac{1}{x^n}$ for each $n \in \mathbb{N}$. If $x > 0$ we define an $n$th root $x^{\frac{1}{n}}$ of $x$ to be
a positive real number whose $n$th power is $x$. We use the notation $x^{\frac{p}{q}} = (x^{\frac{1}{q}})^p$ for
$q \in \mathbb{N}$ and $p \in \mathbb{Z}$ when these exist.


Uniqueness and existence of nth roots are established in exercises. 
Though $0^0$ is undefined, it is a common convention (which by default we use unless
otherwise specified in this text) to use $g(x) = (f(x))^0$ to denote the function $g(x) = 1$ (so
we define the function $g(x)$ to be one even when $f(x) = 0$ in this notation). This makes
many theorem statements less messy, though it is somewhat confusing because it means
that, most of the time, when we write $x^0$ what we mean is $x^0$ if $x \ne 0$ and 1 if $x = 0$.


Note that we have defined positive integer powers of x recursively, which is a notion 
described in more detail in the Induction portion of the Supplementary Materials section. 
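To make the recursive character of Definition 14 concrete, here is a small sketch that computes integer powers exactly as the definition is worded. It is restricted to rational $x$ (using Python's `Fraction`) so that the arithmetic is exact; that restriction and the name `power` are assumptions made for the illustration.

```python
from fractions import Fraction


def power(x: Fraction, n: int) -> Fraction:
    """Integer powers of x, following the cases of Definition 14."""
    if n == 1:
        return x                      # x^1 = x
    if n > 1:
        return x * power(x, n - 1)    # x^n = x * x^(n-1)
    if x == 0:
        raise ValueError("x^0 and negative powers require x != 0")
    if n == 0:
        return Fraction(1)            # x^0 = 1 for x != 0
    return 1 / power(x, -n)           # x^(-n) = 1 / x^n


assert power(Fraction(2), 10) == 1024
assert power(Fraction(2, 3), -2) == Fraction(9, 4)
```

Each call with $n > 1$ refers only to the already defined value for $n - 1$, which is the pattern the induction principle justifies.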


Theorem 2.6. N is not bounded above. 


Proof. Suppose $\mathbb{N}$ is bounded above. Then $\mathbb{N}$ has a least upper bound $u$. Hence, there is
some $n \in \mathbb{N}$ so that $n > u - 1$ by the Approximation Property, which means that $n + 1 > u$,
which is a contradiction to $u$ being an upper bound for $\mathbb{N}$ since $n + 1 \in \mathbb{N}$.


Induction can be used to prove many interesting results. One example is the Binomial 
Theorem, which is our next objective. 


Definition 15 


The notation $\binom{n}{k}$ means $\frac{n!}{(n-k)!\,k!}$ if $k \le n$ and $n$ and $k$ are non-negative integers
(where $n! = n(n-1)(n-2)\cdots(1)$ if $n \ge 1$ and $0! = 1$).


Theorem 2.7. For any $n, k \in \mathbb{N}$ so that $n \ge k$ it is true that $\binom{n}{k-1} + \binom{n}{k} = \binom{n+1}{k}$.


Proof. $\frac{n!}{(n-k)!\,k!} + \frac{n!}{(n-k+1)!\,(k-1)!} = \frac{n!\,(n-k+1+k)}{(n-k+1)(n-k)!\,k(k-1)!} = \frac{(n+1)!}{(n+1-k)!\,k!}$.
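The identity in Theorem 2.7 is easy to spot-check numerically. The sketch below defines the binomial coefficient directly from the factorial formula of Definition 15 (the helper name `binom` is ours) and tests the identity for a range of small values; it is a check under those assumptions, not a proof.

```python
from math import factorial


def binom(n: int, k: int) -> int:
    """n! / ((n-k)! k!) for integers 0 <= k <= n, as in Definition 15."""
    if not 0 <= k <= n:
        raise ValueError("binom requires 0 <= k <= n")
    return factorial(n) // (factorial(n - k) * factorial(k))


# Theorem 2.7: binom(n, k-1) + binom(n, k) == binom(n+1, k) for 1 <= k <= n.
for n in range(1, 40):
    for k in range(1, n + 1):
        assert binom(n, k - 1) + binom(n, k) == binom(n + 1, k)
```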


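Theorem 2.8. The Binomial Theorem. For every positive integer $n$, $(x+y)^n = \sum_{i=0}^{n} \binom{n}{i} x^i y^{n-i}$.


Proof. We proceed by induction on $n$. If $n = 1$ we note that $(x+y)^1 = \binom{1}{0} x^0 y^1 + \binom{1}{1} x^1 y^0$
is true. Assume that $(x+y)^k = \sum_{i=0}^{k} \binom{k}{i} x^i y^{k-i}$. Multiplying by $(x+y)$ on both
sides of this equation and rearranging terms yields
$\binom{k+1}{0} x^0 y^{k+1} + \sum_{i=1}^{k} \left( \binom{k}{i-1} + \binom{k}{i} \right) x^i y^{k+1-i} + \binom{k+1}{k+1} x^{k+1} y^0$
on the right side of the equation, which, by Theorem 2.7, is $\sum_{i=0}^{k+1} \binom{k+1}{i} x^i y^{k+1-i}$.
Hence $(x+y)^{k+1} = \sum_{i=0}^{k+1} \binom{k+1}{i} x^i y^{k+1-i}$, and the result follows by induction.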
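For integer values of $x$ and $y$ the Binomial Theorem can be checked exactly. The sketch below uses the standard library function `math.comb` for the binomial coefficients and writes the terms as $x^i y^{n-i}$; the helper name `binomial_expansion` is an assumption of this illustration. Python evaluates `0 ** 0` as 1, which agrees with the convention adopted after Definition 14.

```python
from math import comb


def binomial_expansion(x: int, y: int, n: int) -> int:
    """Sum of comb(n, i) * x^i * y^(n-i) for i = 0, ..., n."""
    return sum(comb(n, i) * x**i * y**(n - i) for i in range(n + 1))


# Finite check over small integers, including x = 0 and y = 0.
for n in range(1, 12):
    for x in range(-3, 4):
        for y in range(-3, 4):
            assert binomial_expansion(x, y, n) == (x + y) ** n
```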




One useful consequence of the fact that the natural numbers are not bounded above is 
that the rational numbers are dense in the real numbers, meaning that there is a rational 
number in each non-empty open interval. We will present two proofs. 


Theorem 2.9. Let $a, b$ be real numbers so that $a < b$. Then there is a rational number
between $a$ and $b$.


Proof. If $a < 0 < b$ then since 0 is rational, we are finished. Next, assume that $0 \le a < b$.
Since $\mathbb{N}$ is not bounded above by Theorem 2.6, we can find a positive integer $q > \frac{1}{b-a}$, so
that $\frac{1}{q} < b - a$. Also since $\mathbb{N}$ is not bounded above, we can find a positive integer $m > qa$
so that $\frac{m}{q} > a$. Let $S = \{i \in \mathbb{N} \mid \frac{i}{q} > a\}$. Since $S$ is a non-empty subset of $\mathbb{N}$ it follows that
$S$ has a least element $p$. Note that if $p = 1$ then $\frac{p-1}{q} = 0 \le a$, and otherwise $p - 1 \in \mathbb{N}$, so
$\frac{p-1}{q} \le a$ since $p$ is the least element of $S$. Hence, $\frac{p}{q} = \frac{p-1}{q} + \frac{1}{q} < a + b - a = b$, so $a < \frac{p}{q} < b$.

Finally, assume that $a < b \le 0$. Then by the preceding argument we can find a rational
number $r$ so that $-b < r < -a$, so $a < -r < b$. Hence, in each case the result follows.


Alternate proof: 


Proof. Since $\mathbb{N}$ is not bounded above, we can find $q \in \mathbb{N}$ so that $q > \frac{1}{b-a}$, so $qb - qa > 1$.
By Exercise 2.6, we know that there is a first integer $m$ so that $m \ge qb$. Hence, $m - 1$ is an
integer so that $m - 1 < qb$. Since $m - 1 \ge qb - 1 > qa$ it follows that $qa < m - 1 < qb$, and
therefore $a < \frac{m-1}{q} < b$.
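The alternate proof is constructive, so it can be followed step by step in code. In the sketch below the endpoints $a$ and $b$ are themselves taken to be rational (Python `Fraction` values) purely so that the arithmetic is exact; the theorem of course concerns arbitrary real endpoints. The function name `rational_between` is hypothetical.

```python
import math
from fractions import Fraction


def rational_between(a: Fraction, b: Fraction) -> Fraction:
    """Return a rational r with a < r < b, following the alternate proof."""
    if not a < b:
        raise ValueError("requires a < b")
    # A natural number q with q > 1/(b - a), so that q*b - q*a > 1.
    q = math.floor(1 / (b - a)) + 1
    # The first integer m with m >= q*b.
    m = math.ceil(q * b)
    # Then q*a < m - 1 < q*b, so (m - 1)/q lies strictly between a and b.
    return Fraction(m - 1, q)


r = rational_between(Fraction(1, 3), Fraction(1, 2))
assert Fraction(1, 3) < r < Fraction(1, 2)
```

Note that no case split on the signs of $a$ and $b$ is needed here, which is the advantage of the alternate proof over the first one.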


We sometimes want to talk about divisibility and prime numbers, so we will define those 
terms here. Some exercises that are good induction examples use divisibility. 


Definition 16 


We say that an integer $m$ is divisible by an integer $k$ if $\frac{m}{k}$ is an integer (also
written as $m = qk$ for some integer $q$), and we refer to $k$ as a divisor of the integer $m$
and say that $k$ divides $m$ or $m$ is divisible by $k$. We use the notation $k \mid m$ to denote
"$k$ divides $m$."


We also use some of the same terms for any ratio $\frac{x}{y}$ of real numbers $x$ and $y$ and refer
to $x$ as the numerator or dividend and $y$ as the denominator or divisor, but when we are
referring to a divisor of an integer we normally mean the definition above (an integer which
divides into the numerator evenly).


Definition 17 


We say that a natural number $p > 1$ is prime if the only natural numbers that
divide $p$ are 1 and $p$. A positive integer is composite if it is not prime (meaning
it is the product of two natural numbers, neither of which is one). We also refer
to any non-zero integer as being composite if its absolute value is composite. The
greatest common divisor of non-zero integers $a, b$ is the largest natural number that
is a divisor of both $a$ and $b$. We say an integer is even if it is divisible by two, and
odd if it is not.
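The definitions of divisor, prime and greatest common divisor translate almost word for word into (deliberately naive) code. The sketch below is meant only to mirror Definitions 16 and 17; the function names are our own, and the searches are by brute force rather than by any efficient algorithm.

```python
def divides(k: int, m: int) -> bool:
    """True when m = q*k for some integer q (k assumed non-zero)."""
    return k != 0 and m % k == 0


def is_prime(p: int) -> bool:
    """A natural number p > 1 whose only natural divisors are 1 and p."""
    return p > 1 and all(not divides(k, p) for k in range(2, p))


def greatest_common_divisor(a: int, b: int) -> int:
    """Largest natural number dividing both non-zero integers a and b."""
    if a == 0 or b == 0:
        raise ValueError("defined here only for non-zero integers")
    bound = min(abs(a), abs(b))
    return max(d for d in range(1, bound + 1) if divides(d, a) and divides(d, b))


assert is_prime(13) and not is_prime(15)
assert greatest_common_divisor(12, 18) == 6
```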


We will not be spending a lot of time on divisibility, but the Fundamental Theorem of 
Arithmetic is a good theorem using strong induction for its proof and is helpful in advanced 
calculus. So, a proof of this result is also found in the Supplementary Materials section for 
the single variable development. 




Exercises: 
Exercise 2.1. For each natural number $n$, $1^2 + 2^2 + 3^2 + \dots + n^2 = \frac{n(n+1)(2n+1)}{6}$.

Exercise 2.2. For each natural number $n$, $1^3 + 2^3 + 3^3 + \dots + n^3 = \frac{n^2(n+1)^2}{4}$.


Exercise 2.3. $5^n - 1$ is divisible by 4 for all natural numbers $n$.


Exercise 2.4. Let $n \in \mathbb{Z}$. Then $n$ is odd if and only if $n + 1$ is even and $n$ is even if and only
if $n + 1$ is odd.


Exercise 2.5. Define the Fibonacci sequence to be the function $f : \mathbb{N} \to \mathbb{R}$ by the following
recursive definition. We use the notation $f(n) = a_n$ and define $a_1 = a_2 = 1$ and define
$a_n = a_{n-1} + a_{n-2}$ for integers $n > 2$. Prove that
$\frac{1}{\sqrt{5}} \left[ \left( \frac{1+\sqrt{5}}{2} \right)^n - \left( \frac{1-\sqrt{5}}{2} \right)^n \right] = a_n$ for
every positive integer $n$. (Assume, for now, that we know there is a positive number whose
square is five - a fact that will be proven later).


Exercise 2.6. Let $S$ be a non-empty set of integers which is bounded below. Then $S$ has a
first element.


Exercise 2.7. Let $m, n \in \mathbb{Z}$. If $m$ is even then $mn$ is even. If $m$ and $n$ are both odd then
$mn$ is odd.


Exercise 2.8. Let $m \in \mathbb{Z}$. Then, for every positive integer $n$, it is true that $m$ is even if
and only if $m^n$ is even.


Exercise 2.9. Let $S$ be a non-empty set of integers which is bounded above. Then $S$ has a
last element.


Exercise 2.10. For any natural number $n$ it is true that $n^3 + 3n$ is even.

Exercise 2.11. For any natural number $n$ it is true that $n^3 + 2n$ is divisible by three.

Exercise 2.12. Show $n! > 2^n$ for any natural number $n \ge 4$.


Exercise 2.13. Let $0 < a < b$. Then, for every positive integer $n$, show that $0 < a^n < b^n$.
If there are positive numbers $a^{\frac{1}{n}}$ and $b^{\frac{1}{n}}$ whose $n$th powers are $a$ and $b$ respectively, then
$0 < a^{\frac{1}{n}} < b^{\frac{1}{n}}$.




Hints: 


Hint to Exercise 2.1. For each $n \in \mathbb{N}$ it is true that $1^2 + 2^2 + 3^2 + \dots + n^2 = \frac{n(n+1)(2n+1)}{6}$.


Use induction.


Hint to Exercise 2.2. For each natural number $n$, $1^3 + 2^3 + 3^3 + \dots + n^3 = \frac{n^2(n+1)^2}{4}$.


Use induction. 


Hint to Exercise 2.3. $5^n - 1$ is divisible by 4 for all natural numbers $n$.


Use induction. Remember that for a natural number to be divisible by 4 is the same as
that number being equal to $4m$ for some integer $m$.


Hint to Exercise 2.4. Let $n \in \mathbb{Z}$. Then $n$ is odd if and only if $n + 1$ is even and $n$ is even
if and only if $n + 1$ is odd.


Divide an even number plus one by two. Explain why an integer plus one half cannot 
be an integer. 


Hint to Exercise 2.5. Define the Fibonacci sequence to be the function $f : \mathbb{N} \to \mathbb{R}$ by the
following recursive definition. We use the notation $f(n) = a_n$ and define $a_1 = a_2 = 1$ and
define $a_n = a_{n-1} + a_{n-2}$ for integers $n > 2$. Prove that
$\frac{1}{\sqrt{5}} \left[ \left( \frac{1+\sqrt{5}}{2} \right)^n - \left( \frac{1-\sqrt{5}}{2} \right)^n \right] = a_n$
for every positive integer $n$. (Assume, for now, that we know there is a positive number
whose square is five - a fact that will be proven later).


Use strong induction. 


Hint to Exercise 2.6. Let S be a non-empty set of integers which is bounded below. Then 
S has a first element. 


One approach would be to look at $S + k = \{s + k \mid s \in S\}$, where $k$ is a large enough natural
number so that all elements of $S + k$ are positive (once you have explained why there is a natural
number $k$ so that all elements of $S + k$ are positive).


Hint to Exercise 2.7. Let m,n € Z. If m is even then mn is even. If m and n are both 
odd then mn is odd. 


Use the definition of being even and odd and Exercise 2.4. 


Hint to Exercise 2.8. Let $m \in \mathbb{Z}$. Then, for every positive integer $n$, it is true that $m$ is
even if and only if $m^n$ is even.


Use Exercise 2.7 and induction. 


Hint to Exercise 2.9. Let S be a non-empty set of integers which is bounded above. Then 
S has a last element. 


Consider the set of integers which are upper bounds for S, and use well ordering. 


Hint to Exercise 2.10. For any natural number $n$ it is true that $n^3 + 3n$ is even.


Use induction or prove that the sum of two odd integers is even.


Hint to Exercise 2.11. For any natural number $n$ it is true that $n^3 + 2n$ is divisible by
three.


Use induction to show that $n^3 + 2n$ is always three times a natural number.


Hint to Exercise 2.12. Show $n! > 2^n$ for any natural number $n \ge 4$.


Use generalized induction. 


Hint to Exercise 2.13. Let $0 < a < b$. Then, for every positive integer $n$, show that
$0 < a^n < b^n$. If there are positive numbers $a^{\frac{1}{n}}$ and $b^{\frac{1}{n}}$ whose $n$th powers are $a$ and $b$
respectively, then $0 < a^{\frac{1}{n}} < b^{\frac{1}{n}}$.


Use induction to show $0 < a^n < b^n$. Then use that result to prove $0 < a^{\frac{1}{n}} < b^{\frac{1}{n}}$ (try
contradiction to see the connection).




Solutions: 


Solution to Exercise 2.1. For each $n \in \mathbb{N}$ it is true that $1^2 + 2^2 + 3^2 + \dots + n^2 = \frac{n(n+1)(2n+1)}{6}$.


Proof. Proceed by induction. If $n = 1$ then $1^2 = \frac{(1)(2)(3)}{6}$. Assume that
$1^2 + 2^2 + 3^2 + \dots + k^2 = \frac{k(k+1)(2k+1)}{6}$ for some $k \in \mathbb{N}$. Then
$1^2 + 2^2 + 3^2 + \dots + k^2 + (k+1)^2 = \frac{k(k+1)(2k+1)}{6} + (k+1)^2 = \frac{2k^3 + 3k^2 + k + 6k^2 + 12k + 6}{6} = \frac{2k^3 + 9k^2 + 13k + 6}{6} = \frac{(k+1)(k+2)(2k+3)}{6}$.
The result follows by induction.
Solution to Exercise 2.2. For each natural number $n$, $1^3 + 2^3 + 3^3 + \dots + n^3 = \frac{n^2(n+1)^2}{4}$.

Proof. If $n = 1$ we have $1^3 = \frac{1^2(1+1)^2}{4}$. Assume that for a positive integer $k$ it is true that
$1^3 + 2^3 + 3^3 + \dots + k^3 = \frac{k^2(k+1)^2}{4}$. Then
$1^3 + 2^3 + 3^3 + \dots + k^3 + (k+1)^3 = \frac{k^2(k+1)^2}{4} + (k+1)^3 = \frac{k^4 + 2k^3 + k^2}{4} + k^3 + 3k^2 + 3k + 1 = \frac{k^4 + 2k^3 + k^2 + 4k^3 + 12k^2 + 12k + 4}{4} = \frac{k^4 + 6k^3 + 13k^2 + 12k + 4}{4} = \frac{(k+1)^2(k+2)^2}{4}$.
The result follows by induction.


Solution to Exercise 2.3. $5^n - 1$ is divisible by 4 for all natural numbers $n$.


Proof. Proceeding by induction, when $n = 1$ we have $5^1 - 1 = 4$ is divisible by four since
$\frac{4}{4} = 1$. Assume that for some $n = k \in \mathbb{N}$ we know that $5^k - 1 = 4m$ for some natural
number $m$. Then it follows that $5(5^k - 1) = 20m$, so $5^{k+1} - 1 = 20m + 4 = 4(5m + 1)$.
Thus, $5^{k+1} - 1$ is divisible by four, so the statement holds for all natural numbers $n$ by
induction.
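The inductive step above does more than show divisibility: it tells us how the quotient changes, since $5^k - 1 = 4m$ gives $5^{k+1} - 1 = 4(5m + 1)$. The sketch below follows that recipe to produce the quotient for each $n$; the name `witness` is ours and is used only for this illustration.

```python
def witness(n: int) -> int:
    """Return m with 5**n - 1 == 4*m, built as in the inductive step."""
    m = 1                    # base case: 5**1 - 1 == 4 * 1
    for _ in range(n - 1):
        m = 5 * m + 1        # inductive step: m becomes 5m + 1
    return m


for n in range(1, 50):
    assert 5**n - 1 == 4 * witness(n)
```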


Solution to Exercise 2.4. Let $n \in \mathbb{Z}$. Then $n$ is odd if and only if $n + 1$ is even and $n$ is
even if and only if $n + 1$ is odd.


Proof. We proceed by induction for natural numbers $n$. Note that $n = 1$ is not even since
$\frac{1}{2} < 1$ which means it cannot be a positive integer (and $\frac{1}{2}$ is positive, so it is not an integer),
whereas $1 + 1$ is even since $\frac{2}{2} = 1$ and $2 + 1$ is odd since $\frac{3}{2} = 1 + \frac{1}{2}$ which is between one
and two and so is not an integer. Using strong induction we assume that the statement has
been established for all $n \le k \in \mathbb{N}$ where $k \ge 2$. If $k + 1$ is even then $\frac{k+2}{2} = \frac{k+1}{2} + \frac{1}{2}$
where $\frac{k+1}{2} \in \mathbb{N}$. However, since $\frac{1}{2} < 1$ we know that $\frac{k+1}{2} < \frac{k+1}{2} + \frac{1}{2} < \frac{k+1}{2} + 1$ and we also
know there are no integers between $\frac{k+1}{2}$ and $\frac{k+1}{2} + 1$, so $\frac{k+2}{2}$ is not an integer, which
means $k + 2$ is odd. If $k + 1$ is odd then from the induction hypothesis we know that $k$ is
even, which means that $\frac{k+2}{2} = \frac{k}{2} + 1$ is the sum of two natural numbers which is a natural
number, so $k + 2$ is even.

Having shown this is true for natural numbers, we first note that for any negative integer
$n$, it is true that if $-n = 2m$ then $n = -2m$, so $n$ is even if and only if $-n$ is even. If $m = 0$
then $m$ is even and $m + 1$ is odd, and if $m = -1$ then $m$ is odd and $m + 1 = 0$ is even. If
$m$ is an integer less than $-1$ then $-m, (-m + 1), (-m - 1) \in \mathbb{N}$ and so we know that $-m$ is
even if and only if $-m + 1$ and $-m - 1$ are odd. We also know that $m$ is even if and only
if $-m$ is even, and $-m + 1$ and $-m - 1$ are odd if and only if $m - 1$ and $m + 1$ are odd.
Thus, $m$ is even if and only if $m + 1$ and $m - 1$ are odd.


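Solution to Exercise 2.5. Define the Fibonacci sequence to be the function $f : \mathbb{N} \to \mathbb{R}$ by
the following recursive definition. We use the notation $f(n) = a_n$ and define $a_1 = a_2 = 1$
and define $a_n = a_{n-1} + a_{n-2}$ for integers $n > 2$. Prove that
$\frac{1}{\sqrt{5}} \left[ \left( \frac{1+\sqrt{5}}{2} \right)^n - \left( \frac{1-\sqrt{5}}{2} \right)^n \right] = a_n$
for every positive integer $n$. (Assume, for now, that we know there is a positive number
whose square is five - a fact that will be proven later).


Proof. We proceed by strong induction to show that
$\frac{1}{\sqrt{5}} \left[ \left( \frac{1+\sqrt{5}}{2} \right)^j - \left( \frac{1-\sqrt{5}}{2} \right)^j \right] = a_j$ for every
positive integer $j \le n$. The statement is immediate if $n = 1$ or $n = 2$. Assume the
statement is true for all $1 \le j \le k$ for some $k \ge 2$. Then it follows that
$a_{k+1} = a_k + a_{k-1} = \frac{1}{\sqrt{5}} \left[ \left( \frac{1+\sqrt{5}}{2} \right)^k - \left( \frac{1-\sqrt{5}}{2} \right)^k \right] + \frac{1}{\sqrt{5}} \left[ \left( \frac{1+\sqrt{5}}{2} \right)^{k-1} - \left( \frac{1-\sqrt{5}}{2} \right)^{k-1} \right]$
$= \frac{1}{\sqrt{5}} \left[ \left( \frac{1+\sqrt{5}}{2} \right)^{k-1} \left( \frac{1+\sqrt{5}}{2} + 1 \right) - \left( \frac{1-\sqrt{5}}{2} \right)^{k-1} \left( \frac{1-\sqrt{5}}{2} + 1 \right) \right]$.
Note that $\left( \frac{1+\sqrt{5}}{2} \right)^2 = \frac{1 + 5 + 2\sqrt{5}}{4} = \frac{3+\sqrt{5}}{2} = \frac{1+\sqrt{5}}{2} + 1$, and similarly
$\left( \frac{1-\sqrt{5}}{2} \right)^2 = \frac{1 + 5 - 2\sqrt{5}}{4} = \frac{3-\sqrt{5}}{2} = \frac{1-\sqrt{5}}{2} + 1$. Thus,
$a_{k+1} = \frac{1}{\sqrt{5}} \left[ \left( \frac{1+\sqrt{5}}{2} \right)^{k+1} - \left( \frac{1-\sqrt{5}}{2} \right)^{k+1} \right]$, as desired. The result
follows.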
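The closed form for the Fibonacci numbers can be checked without any floating point error by working with numbers of the form $u + v\sqrt{5}$ where $u$ and $v$ are rational. The sketch below stores such a number as a pair `(u, v)`; this representation and the names `mul`, `closed_form` and `fib` are assumptions made for the illustration.

```python
from fractions import Fraction


def mul(s, t):
    """Multiply s = (a, b) and t = (c, d), read as a + b*sqrt(5) and c + d*sqrt(5)."""
    a, b = s
    c, d = t
    return (a * c + 5 * b * d, a * d + b * c)


def closed_form(n: int) -> Fraction:
    """Evaluate (1/sqrt 5) * (((1 + sqrt 5)/2)^n - ((1 - sqrt 5)/2)^n) exactly."""
    phi = (Fraction(1, 2), Fraction(1, 2))    # (1 + sqrt 5)/2
    psi = (Fraction(1, 2), Fraction(-1, 2))   # (1 - sqrt 5)/2
    p, s = phi, psi
    for _ in range(n - 1):
        p, s = mul(p, phi), mul(s, psi)
    # phi^n - psi^n has zero rational part, so it equals (p[1] - s[1]) * sqrt(5).
    assert p[0] - s[0] == 0
    return p[1] - s[1]


def fib(n: int) -> int:
    """a_n from the recursive definition a_1 = a_2 = 1, a_n = a_(n-1) + a_(n-2)."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a


for n in range(1, 60):
    assert closed_form(n) == fib(n)
```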


Solution to Exercise 2.6. Let S be a non-empty set of integers which is bounded below. 
Then S has a first element. 


Proof. Let $m$ be the greatest lower bound of $S$. By the Approximation Property there is
some integer $n \in S$ so that $m \le n < m + 1$. Suppose $n > m$. Then $n - 1 < m < n$ and
there are no integers in $(n - 1, n)$, which means that $S$ contains no points less than $m$ or in
$(n - 1, n)$ and hence no points less than $n$. Thus, $n$ is a lower bound for $S$, contradicting
the fact that $m$ is the greatest lower bound of $S$. We conclude that $n = m$ and therefore
$m \in S$.




Solution to Exercise 2.7. Let $m, n \in \mathbb{Z}$. If $m$ is even then $mn$ is even. If $m$ and $n$ are
both odd then $mn$ is odd.


Proof. If $j, k$ are integers and $j$ is even then $j = 2s$ for some integer $s$, and so $jk = 2sk$, which
means that $jk$ is even.

If $j, k$ are odd integers then $j = 2s + 1$ and $k = 2t + 1$ for some integers $s, t$ by Exercise
2.4, which means that $jk = 4st + 2(s + t) + 1$, which is odd since $4st + 2(s + t) = 2(2st + s + t)$
is even (by Exercise 2.4).


Solution to Exercise 2.8. Let $m \in \mathbb{Z}$. Then, for every positive integer $n$, it is true that
$m$ is even if and only if $m^n$ is even.


Proof. If $m$ is even then $m^1$ is even. Assuming $m^k$ is even for some natural number $k$, we
have that $m^k m = m^{k+1}$ is a product of two even integers and is even by Exercise 2.7. If
$m$ is odd then $m^1 = m$ is odd. Assuming $m^k$ is odd for some natural number $k$ we have that
$m^k m = m^{k+1}$ is a product of odd integers and is odd by Exercise 2.7. Thus, by induction
it follows that $m^n$ is even for all $n \in \mathbb{N}$ if $m$ is even, and $m^n$ is odd for all $n \in \mathbb{N}$ if $m$ is
odd.


Solution to Exercise 2.9. Let $S$ be a non-empty set of integers which is bounded above.
Then $S$ has a last element.


Proof. Let $u = \sup(S)$. By the Approximation Property we can find an element $s \in S$ so
that $u - 1 < s \le u$. But there are no integers in $(s, s + 1)$, which means that every element
of $S$ is less than or equal to $s$, and therefore $s = u$, which is the largest element of $S$.


Solution to Exercise 2.10. For any natural number $n$ it is true that $n^3 + 3n$ is even.


Proof. Proceeding by induction, $1^3 + 3(1) = 4 = 2(2)$ is even. Assume that $k^3 + 3k = 2m$
for some natural number $m$. Then $(k+1)^3 + 3(k+1) = k^3 + 3k^2 + 3k + 1 + 3k + 3$
$= k^3 + 3k + (3k^2 + 3k + 4) = k^3 + 3k + 3k(k+1) + 4 = 2m + 3k(k+1) + 4$. By Exercise
2.4, we know that either $k$ or $k + 1$ is even. Thus, there is some natural number $j$ so that
either $k = 2j$ or $k + 1 = 2j$. Hence, either $(k+1)^3 + 3(k+1) = 2(m + 3j(k+1) + 2)$ or
$(k+1)^3 + 3(k+1) = 2(m + 3jk + 2)$, which means that $(k+1)^3 + 3(k+1)$ is even. The
result follows by induction.


Solution to Exercise 2.11. For any natural number $n$ it is true that $n^3 + 2n$ is divisible
by three.


Proof. We proceed by induction. We know that $1^3 + 2(1) = 3$ is divisible by 3. Assume that
$k^3 + 2k = 3m$ for some natural number $m$. Then $(k+1)^3 + 2(k+1) = k^3 + 3k^2 + 3k + 1 + 2k + 2$
$= (k^3 + 2k) + 3(k^2 + k + 1) = 3(m + k^2 + k + 1)$, which is divisible by 3. The result follows
by induction.




Solution to Exercise 2.12. Show $n! > 2^n$ for any natural number $n \ge 4$.


Proof. We proceed by generalized induction. We know $4! > 2^4$ is true. Assume the
statement $k! > 2^k$ is true for an integer $k \ge 4$. Then $(k+1)! = (k+1)(k!) \ge 5(k!) >
2(k!) > 2(2^k) = 2^{k+1}$ by the induction hypothesis. Thus, $n! > 2^n$ for any natural number
$n \ge 4$ by generalized induction.


Solution to Exercise 2.13. Let $0 < a < b$. Then, for every positive integer $n$, show
that $0 < a^n < b^n$. If there are positive numbers $a^{\frac{1}{n}}$ and $b^{\frac{1}{n}}$ whose $n$th powers are $a$ and $b$
respectively, then $0 < a^{\frac{1}{n}} < b^{\frac{1}{n}}$.


Proof. We proceed to show $0 < a^n < b^n$ by induction. If $n = 1$ then $0 < a < b$ is given
to be true. Assume this is true when $n = k$. Then $0 < a^k < b^k$. Since $a > 0$ we have
$(0)(a) < a^{k+1}$. Since $a < b$ we know that $a(a^k) < b(a^k)$. Since $a^k < b^k$ we know that
$b(a^k) < b(b^k)$. Combining these inequalities gives $0 < a^{k+1} < b^{k+1}$. By induction the result
holds for all positive integers $n$.

We are given that these $n$th roots are positive. By the preceding part of the argument,
if $b^{\frac{1}{n}} \le a^{\frac{1}{n}}$ then $(b^{\frac{1}{n}})^n \le (a^{\frac{1}{n}})^n$, so $b \le a$, a contradiction. Hence, it must follow that
$0 < a^{\frac{1}{n}} < b^{\frac{1}{n}}$.


Chapter 3 


Sequences 


Definition 18 


A sequence is a function whose domain is $\mathbb{N}$. If $f : \mathbb{N} \to \mathbb{R}$ is a sequence then we
use the notation $\{x_n\}$ to denote the function $f$, where $f(n) = x_n$. We refer to $n$ as
the index of the sequence member $x_n$. Depending on context we may also use $\{x_n\}$
to refer to the range of $f$. If $g : \mathbb{N} \to \mathbb{N}$ is an increasing function then we say that the
sequence $\{x_{g(n)}\}$ is a subsequence of $\{x_n\}$, and normally write $g(i) = n_i$. Essentially,
in a subsequence, infinitely many terms from the original sequence are listed in the
same relative order.


We say that $\{x_n\}$ converges to $c$, often written $\{x_n\} \to c$ (also written $\lim_{n \to \infty} x_n = c$)
if for every $\epsilon > 0$ there is some $k \in \mathbb{N}$ so that if $n > k$ then $|x_n - c| < \epsilon$.

We say that a statement $P(x)$ about $x$ is true for $x \in D$ sufficiently large (or if
$x \in D$ is sufficiently large) if there is a number $M$ so that if $x > M$ and $x \in D$ then
$P(x)$ is true. We say that a statement $P(x)$ is true for $x \in D$ sufficiently small if
there is a number $\delta$ so that if $x < \delta$ and $x \in D$ then $P(x)$ is true. We say statement $P(x)$ is
true if $x \in D$ is sufficiently close to $c$ if there is a $\delta > 0$ so that $P(x)$ is true for all
$x \in D$ so that $|x - c| < \delta$. If no set $D$ is specified it is understood that $D = \mathbb{R}$.


If $D$ is understood we may not mention what $D$ is. For instance, when talking about
sequence indices, those indices must be natural numbers. For example, rather than say $x_n > 2$
for $n \in \mathbb{N}$ sufficiently large we might just say $x_n > 2$ for $n$ sufficiently large (since it is
known that there is no such thing as $x_n$ for $n \notin \mathbb{N}$). Another way of stating the definition
of convergence that sounds more intuitive for some people is thus:

We say $\{x_n\} \to c$ if for every positive distance $\epsilon$ from $c$, every sequence member $x_n$ is
distance less than $\epsilon$ from $c$ if its index $n$ is sufficiently large.


In the theorems that follow, sequences listed are assumed to be sequences of real numbers 
unless otherwise stated. Likewise, if we indicate that a sequence converges to something 
then it is understood the thing converged to is a real number. 


When establishing sequence convergence for a sequence $\{x_n\}$ to a point $p$, one normally
starts by declaring an arbitrary $\epsilon > 0$ and then shows the existence of a corresponding
$k \in \mathbb{N}$ so that sequence members of index higher than $k$ (sequence members listed later in
the sequence order than the $k$th term) are within a distance $\epsilon$ of the point $p$. The integer
$k$ is different for different $\epsilon$ values, so $k$ can be thought of as a function of $\epsilon$. When first
encountering sequence convergence, people sometimes think that they just have to find a
$k$ so that $x_n$ is within some particular distance of $p$, but that isn't what is needed. It has
to be shown that no matter what distance $\epsilon$ from $p$ we start with, there is always some
corresponding $k = k(\epsilon)$ so that all $x_n$ sequence members with $n > k$ are within distance $\epsilon$
of the point $p$. We are really showing the existence of a function $k(\epsilon) : (0, \infty) \to \mathbb{N}$ in this
manner because every $\epsilon > 0$ must have an associated $k$. For brevity in notation, we don't
normally write $k(\epsilon)$ to refer to $k$ as a function of $\epsilon$ and instead simply show the existence
of a $k$ corresponding to an arbitrary $\epsilon > 0$ satisfying the definition of convergence.


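Theorem 3.1. $\{\frac{1}{n}\} \to 0$.


Proof. Let $\epsilon > 0$. Since $\mathbb{N}$ is not bounded above, we can find $k \in \mathbb{N}$ so that $k > \frac{1}{\epsilon}$, so $\frac{1}{k} < \epsilon$.
Hence, if $n > k$ then $0 < \frac{1}{n} < \frac{1}{k} < \epsilon$, so $|\frac{1}{n} - 0| < \epsilon$.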
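The proof of Theorem 3.1 produces an explicit index $k$ for each $\epsilon$, which illustrates the earlier remark that $k$ is really a function of $\epsilon$. The sketch below makes that function explicit for rational $\epsilon$ (so the arithmetic is exact) and checks the inequality for a finite range of indices beyond $k$. The name `k_for` is hypothetical, and a finite check of this kind only illustrates the definition; it does not prove convergence.

```python
import math
from fractions import Fraction


def k_for(eps: Fraction) -> int:
    """Return a natural number k with k > 1/eps, as chosen in the proof."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    return math.floor(1 / eps) + 1


for eps in [Fraction(1, 2), Fraction(1, 10), Fraction(3, 1000)]:
    k = k_for(eps)
    # Every index n > k that we test gives |1/n - 0| < eps.
    assert all(abs(Fraction(1, n) - 0) < eps for n in range(k + 1, k + 500))
```

Smaller values of $\epsilon$ force larger values of $k$, which is exactly the dependence the definition of convergence requires us to handle for every $\epsilon > 0$.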


Theorem 3.2. Let $j \in \mathbb{N}$. If $x_n = c$ for each $n \in \mathbb{N}$ so that $n > j$ then $\{x_n\} \to c$.


Proof. Let $\epsilon > 0$. Since $|x_n - c| = |c - c| = 0 < \epsilon$ for all $n > j$, it follows that $\{x_n\} \to c$.


Theorem 3.3. The sequence $\{x_n\} \to c$ if and only if $\{x_n - c\} \to 0$.


Proof. We know that $\{x_n\} \to c$ if and only if for every $\epsilon > 0$ there is some $N \in \mathbb{N}$ so that
if $n > N$ then $|x_n - c| < \epsilon$, which is true if and only if for every $\epsilon > 0$ there is some $N \in \mathbb{N}$
so that if $n > N$ then $|(x_n - c) - 0| < \epsilon$, which is true if and only if $\{x_n - c\} \to 0$.


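Theorem 3.4. If $\{x_n\}$ is bounded and $\{y_n\} \to 0$ then $\{x_n y_n\} \to 0$.


Proof. Choose $M > 0$ so that $-M < x_n < M$ for all $n \in \mathbb{N}$. Let $\epsilon > 0$. Choose $N \in \mathbb{N}$ so
that if $n > N$ then $|y_n - 0| < \frac{\epsilon}{M}$. Then $|x_n y_n - 0| < M \frac{\epsilon}{M} = \epsilon$, so $\{x_n y_n\} \to 0$.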


The proof of the next theorem uses the fact that a finite set always has a first and a last 
point, which is not really addressed until the end of this chapter in Theorem 3.30. Though 
we normally prefer to put theorems using a result after the result has been proven, in this 
case we put the proof later with other proofs about cardinality in order to avoid taking 
the direction of the subject matter on a tangent in the middle of developing sequences. 
However, the proof of Theorem 3.30 is self-contained (so the argument is not circular) and 
nothing is lost by reading that proof first if preferred. Most of the time we do not quote 
Theorem 3.30 explicitly (it is normal in texts to assume that this is understood). 




Theorem 3.5. If $\{x_n\} \to p$ then $\{x_n\}$ is bounded.


Proof. Choose $k \in \mathbb{N}$ so that if $n \geq k$ then $|x_n - p| < 1$, so $p - 1 < x_n < p + 1$. Let
$M = \max(x_1, x_2, \ldots, x_k, p + 1)$ and $m = \min(x_1, x_2, \ldots, x_k, p - 1)$. Then if $n \leq k$ we know
that $m \leq \min(x_1, x_2, \ldots, x_k) \leq x_n \leq \max(x_1, x_2, \ldots, x_k) \leq M$ and if $n \geq k$ then $m \leq p - 1 <
x_n < p + 1 \leq M$. Hence, for all $n \in \mathbb{N}$ it follows that $m \leq x_n \leq M$.


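Theorem 3.6. The Squeeze Theorem. Let $\{x_n\} \to c$ and $\{z_n\} \to c$, and let $\{y_n\}$ be a
sequence so that for some positive integer $j$, if $n \geq j$ then $x_n \leq y_n \leq z_n$ or $z_n \leq y_n \leq x_n$.
Then $\{y_n\} \to c$.


Proof. Let $\epsilon > 0$. Choose $k_1 \in \mathbb{N}$ so that if $n \geq k_1$ then $|x_n - c| < \epsilon$, and choose $k_2 \in \mathbb{N}$ so
that if $n \geq k_2$ then $|z_n - c| < \epsilon$. If $n \geq k = \max\{k_1, k_2, j\}$ then $c - \epsilon < x_n \leq y_n \leq z_n < c + \epsilon$
or $c - \epsilon < z_n \leq y_n \leq x_n < c + \epsilon$, so $|y_n - c| < \epsilon$.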


The Squeeze Theorem can be used to prove the convergence of many sequences and 
later we can use it to prove limits of functions which are not sequences. Note that in many 
texts a slightly weaker theorem is referred to as the Squeeze Theorem, but the version we 
have proven is useful in more instances. 


Example 3.1. Prove $\{\frac{1}{2n^2+1}\} \to 0$.


Solution. First, note that since $n \geq 1$ and $0 < 1 < 2$ we have $n \leq n^2 < 2n^2 < 2n^2 + 1$,
so $0 < \frac{1}{2n^2+1} < \frac{1}{n}$. Since we have shown that $\{\frac{1}{n}\} \to 0$ and $\{0\} \to 0$ it follows that
$\{\frac{1}{2n^2+1}\} \to 0$ by the Squeeze Theorem.
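
The two inequalities used in this solution can be spot-checked with exact arithmetic. This is only an illustrative sketch under the assumption that checking finitely many $n$ is acceptable as a sanity check; the helper names `squeezed` and `index_for` are hypothetical and not from the text.

```python
from fractions import Fraction


def squeezed(n: int) -> bool:
    """Check 0 < 1/(2n^2 + 1) < 1/n exactly for a single n >= 1."""
    y = Fraction(1, 2 * n * n + 1)
    return 0 < y < Fraction(1, n)


def index_for(eps: Fraction) -> int:
    """Smallest k >= 1 with 1/k < eps.

    Because 1/(2n^2 + 1) < 1/n <= 1/k for n >= k, the same k also works
    for the squeezed sequence.
    """
    k = 1
    while Fraction(1, k) >= eps:
        k += 1
    return k


print(all(squeezed(n) for n in range(1, 2001)))
print(index_for(Fraction(1, 100)))
```

The index found for the outer sequence $\{\frac{1}{n}\}$ is generally much larger than needed for the inner sequence, which is harmless: the definition of convergence only asks for some $k$.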


Theorem 3.7. The Comparison Theorem. Let $\{x_n\} \to c$ and $\{y_n\} \to d$, so that, for some
$j \in \mathbb{N}$, $x_n \leq y_n$ for all $n \geq j$. Then $c \leq d$.


Proof. Suppose $d < c$. We can find $k_1 \in \mathbb{N}$ so that if $n \geq k_1$ then $|x_n - c| < \frac{c-d}{2}$,
meaning that $c - \frac{c-d}{2} < x_n < c + \frac{c-d}{2}$. We can also find $k_2 \in \mathbb{N}$ so that if $n \geq k_2$ then
$|y_n - d| < \frac{c-d}{2}$, meaning that $d - \frac{c-d}{2} < y_n < d + \frac{c-d}{2}$. Thus, if $n \geq k = \max\{k_1, k_2, j\}$
then $y_n < \frac{c+d}{2} < x_n$, which is impossible since we are given that $x_n \leq y_n$ since $n \geq j$.


Theorem 3.8. Let $\{a_n\} \to a \neq 0$ and let $a_n \neq 0$ for each $n \in \mathbb{N}$. Then $\{\frac{1}{a_n}\}$ is bounded.


Proof. We can choose $k \in \mathbb{N}$ so that if $n \geq k$ then $|a_n - a| < \frac{|a|}{2}$ and by the Triangle
Inequality, $|a_n| = |a - (a - a_n)| \geq |a| - |a_n - a| > \frac{|a|}{2}$. Hence, if $n \geq k$ then $\frac{1}{|a_n|} < \frac{2}{|a|}$. It
follows that $\frac{1}{|a_n|} \leq \max\{\frac{2}{|a|}, \frac{1}{|a_1|}, \frac{1}{|a_2|}, \ldots, \frac{1}{|a_{k-1}|}\}$ for all $n \in \mathbb{N}$, so $\{\frac{1}{a_n}\}$ is bounded.


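Theorem 3.9. Let $\{a_n\} \to a$ and $\{b_n\} \to b$. Then:


(a) $\{a_n + b_n\} \to a + b$
(b) $\{a_n b_n\} \to ab$
(c) If $b \neq 0$ and for each $n \in \mathbb{N}$, $b_n \neq 0$, then $\{\frac{a_n}{b_n}\} \to \frac{a}{b}$
(d) $\{c a_n + d b_n\} \to ca + db$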


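Proof. (a) Let $\epsilon > 0$. Pick $k_1 \in \mathbb{N}$ so that if $n \geq k_1$ then $|a_n - a| < \frac{\epsilon}{2}$ and pick $k_2 \in \mathbb{N}$ so that
if $n \geq k_2$ then $|b_n - b| < \frac{\epsilon}{2}$. If $n \geq \max\{k_1, k_2\}$ then $|(a_n + b_n) - (a + b)| \leq |a_n - a| + |b_n - b| < \epsilon$
by the Triangle Inequality.

(b) Note that $\{a_n b_n - ab\} = \{a_n b_n - a_n b + a_n b - ab\} = \{a_n(b_n - b) + b(a_n - a)\}$. Since
by Theorem 3.5 we know $\{a_n\}$ is bounded and $\{b_n - b\} \to 0$ by Theorem 3.3 it follows that
$\{a_n(b_n - b)\} \to 0$ by Theorem 3.4. Similarly, $\{b(a_n - a)\} \to 0$. Hence, by (a), $\{a_n b_n - ab\} \to 0$
so $\{a_n b_n\} \to ab$ by Theorem 3.3.

(c) $\{\frac{a_n}{b_n} - \frac{a}{b}\} = \{\frac{a_n b - a b_n}{b_n b}\} = \{(\frac{1}{b_n})(\frac{1}{b})(b(a_n - a) + a(b - b_n))\}$. We know
$\{\frac{1}{b_n}\}$ is bounded by Theorem 3.8 (and hence so is $\{\frac{1}{b_n b}\}$) and $\{b(a_n - a) + a(b - b_n)\} \to 0$
by (a) and Theorem 3.4, so $\{\frac{a_n}{b_n} - \frac{a}{b}\} \to 0$ by
Theorem 3.4, which means that $\{\frac{a_n}{b_n}\} \to \frac{a}{b}$ by Theorem 3.3.

(d) By (b) and Theorem 3.2 we know that $\{c a_n\} \to ca$ and $\{d b_n\} \to db$. By (a), it
follows that $\{c a_n + d b_n\} \to ca + db$.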


Part (a) of the preceding theorem is the sum rule for sequence limits, part (b) is the 
product rule for sequence limits, and part (c) is the quotient rule for sequence limits. It is 
more convenient for us to have some way to refer to (d) so we will just use "sum rule" to
refer to (d), which is a generalization of (a). 


Theorem 3.10. A sequence can converge to at most one number. 


Proof. Let $\{x_n\} \to s$ and let $\{x_n\} \to t$. By the Comparison Theorem, since $x_n \leq x_n$ for all
$n \in \mathbb{N}$, we know that $s \leq t$ and $t \leq s$ so $s = t$.


Theorem 3.11. Let $\{x_n\} \to c$ and let $\{x_{n_i}\}$ be a subsequence of $\{x_n\}$. Then $\{x_{n_i}\} \to c$.


Proof. By Exercise 3.12, $n_i \geq i$ for each $i \in \mathbb{N}$. Let $\epsilon > 0$. We may choose a $k \in \mathbb{N}$ so that
if $n \geq k$ then $|x_n - c| < \epsilon$, so if $i \geq k$ then $n_i \geq k$, and hence $|x_{n_i} - c| < \epsilon$. Thus, $\{x_{n_i}\} \to c$.


Another proof counts the sequence terms excluded from an interval. This may be more
intuitive for some readers.


Proof. By Exercise 3.6, a sequence converges to a point if and only if every open interval
containing that point excludes at most finitely many sequence terms. Since $\{x_n\} \to c$, for
any open interval $I$ containing $c$, there are only finitely many integers $n$ so that $x_n \notin I$.
Thus, there are only finitely many integers $n_i$ so that $x_{n_i} \notin I$, so $\{x_{n_i}\} \to c$.


Definition 19


Let $h : D \to \mathbb{R}$ be a function. We say that $h$ is increasing if $h(a) < h(b)$ whenever
$a < b$ and $a, b \in D$. We say $h$ is decreasing if $h(a) > h(b)$ whenever $a < b$ and $a, b \in D$.
We say that $h$ is non-decreasing if $h(a) \leq h(b)$ whenever $a < b$ and $a, b \in D$. We say
$h$ is non-increasing if $h(a) \geq h(b)$ whenever $a < b$ and $a, b \in D$. A function which is
either non-increasing or non-decreasing is referred to as monotone.


Note that since sequences are functions, they can be increasing, decreasing, non-increasing,
non-decreasing or monotone as defined above. Thus, a sequence $\{x_n\}$ is non-decreasing
if $x_i \leq x_j$ whenever $i < j$ and a sequence $\{x_n\}$ is non-increasing if $x_i \geq x_j$ whenever $i < j$.
Likewise, a sequence $\{x_n\}$ is increasing if $x_i < x_j$ whenever $i < j$, and a sequence $\{x_n\}$
is decreasing if $x_i > x_j$ whenever $i < j$. A sequence is monotone if it is non-increasing or
non-decreasing.


Theorem 3.12. Monotone Convergence Theorem. Let $\{x_n\}$ be a bounded monotone sequence.
If $\{x_n\}$ is non-decreasing then $\{x_n\}$ converges to its least upper bound. If $\{x_n\}$ is
non-increasing then $\{x_n\}$ converges to its greatest lower bound.


Proof. First, assume $\{x_n\}$ is non-decreasing, let $u = \sup(\{x_n\})$ and let $\epsilon > 0$. By the
Approximation Property, there is a $k \in \mathbb{N}$ so that $u - \epsilon < x_k \leq u$. Since $\{x_n\}$ is
non-decreasing, it follows that $u - \epsilon < x_k \leq x_n \leq u$ and hence $|x_n - u| < \epsilon$ for all $n \geq k$.

Next, assume $\{x_n\}$ is non-increasing, let $b = \inf(\{x_n\})$ and let $\epsilon > 0$. By the Approximation
Property, there is a $k \in \mathbb{N}$ so that $b \leq x_k < b + \epsilon$. Since $\{x_n\}$ is non-increasing, it follows
that $b \leq x_n \leq x_k < b + \epsilon$ and hence $|x_n - b| < \epsilon$ for all $n \geq k$.
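
The structure of this proof (find one term within $\epsilon$ of the supremum, then let monotonicity handle every later term) can be illustrated numerically. The sketch below assumes a hypothetical example sequence $x_n = \frac{n}{n+1}$, which is not from the text; it is non-decreasing with least upper bound 1. The function names are illustrative only, and the check covers finitely many terms.

```python
from fractions import Fraction

SUP = Fraction(1)  # least upper bound of the example sequence (assumed known)


def x(n: int) -> Fraction:
    """Example non-decreasing bounded sequence x_n = n / (n + 1)."""
    return Fraction(n, n + 1)


def first_index_within(eps: Fraction) -> int:
    """First k with SUP - eps < x_k <= SUP (exists by the Approximation Property)."""
    k = 1
    while not (SUP - eps < x(k) <= SUP):
        k += 1
    return k


eps = Fraction(1, 10)
k = first_index_within(eps)
# Monotonicity keeps every later term between x_k and SUP.
print(k, all(abs(x(n) - SUP) < eps for n in range(k, k + 500)))
```

Only the single index $k$ is searched for; no further search is needed for later terms, which is exactly the role monotonicity plays in the proof.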




Definition 20


We say that a point $p$ is a limit point of a set $S$ if for every $\epsilon > 0$ there is a point
$s \in S$ distinct from $p$ so that $|p - s| < \epsilon$. In other words, $p$ is a limit point of $S$ if,
for every $\epsilon > 0$, $(p - \epsilon, p + \epsilon) \cap S \setminus \{p\} \neq \emptyset$. If $p \in S$ and $p$ is not a limit point of $S$
then we refer to $p$ as an isolated point of $S$.


The usual definition of limit point is that a point $p$ is a limit point of a set $S$ if every open
set containing p (or, equivalently, every neighborhood of p) contains a point of S distinct 
from p, but it is more convenient for us to use the definition above (partly since we have 
not yet defined what it means for a set to be open). The fact that these two definitions are 
equivalent is Exercise 3.4. 


Note that a point $p \in S$ is an isolated point of $S$ if and only if $p$ is contained in an open
interval $(p - \epsilon, p + \epsilon)$ for some $\epsilon > 0$, so that $(p - \epsilon, p + \epsilon)$ contains no point of $S$ other than
$p$.


Theorem 3.13. A point $p$ is a limit point of a set $A$ if and only if every open interval
containing $p$ contains infinitely many points of $A$.


Proof. First, assume that every open interval containing $p$ contains infinitely many points
of $A$. Let $\epsilon > 0$. Then $(p - \epsilon, p + \epsilon)$ is an open interval containing $p$, and hence contains
infinitely many points of $A$, including points distinct from $p$. Hence, $p$ is a limit point of $A$.

Next, assume that $p$ is a limit point of $A$. Suppose there is an interval $(a, b)$ containing
$p$ and only finitely many points of $A$. Finite sets have first and last points, so if we let $b'$ be
the first point of $A$ in $(a, b)$ which is greater than $p$ (or $b' = b$ if there is no such point) and
$a'$ be the last point of $A$ in $(a, b)$ which is less than $p$ (or $a' = a$ if there is no such point),
then the interval $(a', b')$ contains no points of $A$ distinct from $p$. Hence,
if we set $\epsilon = \min(|p - a'|, |p - b'|)$ then it follows that there are no points of $A$ distinct from
$p$ which have distance less than $\epsilon$ from $p$, contradicting the definition of limit point.


Theorem 3.14. A point $p$ is a limit point of a set $A$ if and only if there is a sequence
$\{x_n\} \subset A \setminus \{p\}$ so that $\{x_n\} \to p$.


Proof. First, assume that $p$ is a limit point of $A$. Then for each $n \in \mathbb{N}$ there is some $x_n \in A$
distinct from $p$ so that $|p - x_n| < \frac{1}{n}$. We know that $p - \frac{1}{n} < x_n < p + \frac{1}{n}$ for each $n \in \mathbb{N}$ so
$\{x_n\} \to p$ by the Squeeze Theorem.

Next, assume there is a sequence $\{x_n\} \subset A \setminus \{p\}$ so that $\{x_n\} \to p$. Then let $\epsilon > 0$.
For some $k \in \mathbb{N}$, if $n \geq k$ then $|x_n - p| < \epsilon$, so $x_k$ is a point of $A$ distinct from $p$ so that
$|x_k - p| < \epsilon$, and thus $p$ is a limit point of $A$.




The preceding theorem helps us to understand one of the reasons for calling a limit
point a limit point. A point $p$ is a limit point of a set $S$ if $p$ is the limit of a sequence in
$S \setminus \{p\}$.


Example 3.2. (a) What are the limit points of $(0, 1]$?
(b) What are the limit points of $\{\frac{1}{n}\}$?
(c) What are the limit points of $\mathbb{Z}$?

Solution. (a) $[0, 1]$ since every open interval about any point $p$ in this interval intersects $(0, 1]$ at
points other than $p$.

(b) The point 0 is the only limit point of this sequence. Since $\{\frac{1}{n}\} \to 0$ we know that
0 is a limit point of $\{\frac{1}{n}\}$ by Theorem 3.14. Since a sequence can converge to only one point,
any real number $p$ other than zero is contained in an open interval that contains at most
finitely many members of $\{\frac{1}{n}\}$ by Exercise 3.6, which means that $p$ is not a limit point of
$\{\frac{1}{n}\}$ by Theorem 3.13.

(c) The set $\mathbb{Z}$ has no limit points. Let $p \in \mathbb{R}$. If there is an integer $k$ so that $k \in
(p - \frac{1}{2}, p + \frac{1}{2})$ then $p - \frac{1}{2} < k < p + \frac{1}{2}$, so $k - 1 < p - \frac{1}{2} < k < p + \frac{1}{2} < k + 1$. Since there are
no integers between $k$ and $k - 1$ and no integers between $k$ and $k + 1$, $(k - 1, k + 1) \cap \mathbb{Z} = \{k\}$,
which means that $(p - \frac{1}{2}, p + \frac{1}{2})$ can contain at most one integer, and $p$ is therefore not a limit
point of $\mathbb{Z}$ by Theorem 3.13.


Definition 21


A set $U$ is open if for every $p \in U$ there is an $\epsilon > 0$ so that $(p - \epsilon, p + \epsilon) \subset U$. A
set $A$ is closed if its complement is open. If $S \subset \mathbb{R}$ then we say that $U$ is relatively
open in $S$ or just open in $S$ if there is an open set $V$ so that $V \cap S = U$. Likewise, we
say that a set $A$ is closed in $S$ or relatively closed in $S$ if there is a closed set $H$
so that $H \cap S = A$. A set $E$ that contains $(p - \epsilon, p + \epsilon)$ for some $\epsilon > 0$ is referred to
as a neighborhood of $p$. A set $A \subset \mathbb{R}$ with open and closed sets in $A$ as described is
called a subspace of $\mathbb{R}$ whose topology is induced by $\mathbb{R}$ under the subspace topology.
The interior of a set $S$ is denoted by $S^\circ = \{x \in S \mid (x - \epsilon, x + \epsilon) \subset S \text{ for some } \epsilon > 0\}$.
The closure of a set $S$ is denoted by $\overline{S}$ and is the set consisting of $S$ and all limit
points of $S$. The boundary of $S$ is denoted by $\partial(S)$ and is the set of points $p$ so that
for every $\epsilon > 0$, the interval $(p - \epsilon, p + \epsilon)$ contains a point in $S$ and a point which is
not in $S$.


Upon first encountering terms like "open" and "closed" the reasons for the choices of
words to describe open and closed sets may seem somewhat arbitrary, and it can be helpful
to have some way to mentally associate these ideas with their names. One can think of a
set being closed as being closed under sequence convergence, meaning that there is no way
to approach a point external to the set by taking a sequence within the set converging to
that point. Every point a sequence in a closed set can converge to is a point in the set.

The idea of a set being open can be thought of in terms of having freedom to vary near 
a point without being blocked in (so movement options are open within a certain distance 
without leaving the set). Each point in an open set is within an open interval contained in 
that set, meaning there are points in both directions (within some distance) from the point 
in question that remain in the set. In this manner, we don’t have to be concerned with 
sequences from outside of the open set converging to points in the set because they can’t 
be chosen arbitrarily closely to a point in the open set. 

Thus, if sequence convergence is thought of as somehow arriving at a destination point 
then for closed sets, sequences inside the set can’t arrive anywhere outside the set, and for 
open sets, sequences outside the set can’t arrive anywhere inside the set. 

Closed sets and open sets are complementary as sets, but they are not logically complementary. 
A set that is not open need not be closed, and a set that is not closed need not be open. 
Some sets are closed and open. A set may be open, closed, both or neither. 


Theorem 3.15. A set A is closed if and only if A contains all of its limit points. 


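Proof. Let $A$ be closed and let $p \notin A$. Then since $\mathbb{R} \setminus A$ is open, there is an $\epsilon > 0$ such that
$(p - \epsilon, p + \epsilon) \subset \mathbb{R} \setminus A$, so $p$ is not a limit point of $A$.

Let $A$ be a set containing all of its limit points and let $p \notin A$. Since $p$ is not a limit
point of $A$ there is an $\epsilon > 0$ such that $(p - \epsilon, p + \epsilon) \cap A = \emptyset$, so $(p - \epsilon, p + \epsilon) \subset \mathbb{R} \setminus A$,
and hence $\mathbb{R} \setminus A$ is open.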


Example 3.3. Let $A = (0, 1] \cup (\mathbb{Q} \cap (3, \infty))$.
(a) Is $A$ open, closed, both or neither?
(b) Is $(0, \frac{1}{2}]$ open, closed or neither in $A$?
(c) Is $(\pi, 2\pi) \cap A$ open, closed or neither in $A$? Assume we know that $\pi$ and $2\pi$ are
irrational.


Solution. (a) $A$ is neither open nor closed. The point 1 is not contained in an open interval
which is contained in $A$ (likewise, none of the rational numbers in $(3, \infty)$ are in the interior
of $A$), so $A$ is not open. The point 0 is a limit point of $A$ and is not contained in $A$ (the
same can be said of any of the irrational numbers which are greater than three), so $A$ is not
closed.

(b) The set $(0, \frac{1}{2}] = [0, \frac{1}{2}] \cap A$, so $(0, \frac{1}{2}]$ is closed in $A$. Every open interval containing
$\frac{1}{2}$ contains points of $A$ which exceed $\frac{1}{2}$, which means that $(0, \frac{1}{2}]$ is not the intersection of
an open set with $A$, so $(0, \frac{1}{2}]$ is not open in $A$.

(c) Since $(\pi, 2\pi)$ is open, $(\pi, 2\pi) \cap A$ is open in $A$. Since $\pi, 2\pi \notin A$, $[\pi, 2\pi] \cap A = (\pi, 2\pi) \cap A$
is also closed in the subspace $A$.




Example 3.4. (a) Give an example, with proof, of a set which is both open and closed.
(b) Give an example, with proof, of a set which is neither open nor closed.
(c) Let $S = (0, 1) \cup \{3\}$. What are $\overline{S}$, $S^\circ$ and $\partial(S)$?


Solution. (a) $\mathbb{R}$ is both open and closed. For every $p \in \mathbb{R}$, we know $(p - 1, p + 1) \subset \mathbb{R}$, so $\mathbb{R}$
is open. We also know $\mathbb{R}$ is closed since it contains all points of $\mathbb{R}$ and thus all limit points
of $\mathbb{R}$.

(b) $(0, 1]$ is neither open nor closed. It is not open since for every $\epsilon > 0$, the interval
$(1 - \epsilon, 1 + \epsilon)$ contains $1 + \frac{\epsilon}{2}$, which is not in $(0, 1]$. It is not closed because 0 is a limit point
of $(0, 1]$ which is not contained in $(0, 1]$. We can see this because for each $\epsilon > 0$, the interval
$(0 - \epsilon, 0 + \epsilon) \cap (0, 1]$ contains $\min\{\frac{\epsilon}{2}, 1\}$.

(c) $\overline{S} = [0, 1] \cup \{3\}$, $S^\circ = (0, 1)$ and $\partial(S) = \{0, 1, 3\}$.


We did not prove (c) in the preceding example. It might be instructive for the reader 
to do so. It may also be worth noting that the only sets which are both open and closed as 
subsets of the real numbers are the set of real numbers and the empty set, though we are 
not going to prove that yet. 


Theorem 3.16. Let $A \subset \mathbb{R}$. Then $A$ is a closed set if and only if for every $\{x_n\} \subset A$ so
that $\{x_n\}$ converges to some point $p$, it is true that $p \in A$.


Proof. Assume $A$ is closed, $\{x_n\} \subset A$ and $\{x_n\} \to p$. If $p \in \{x_n\}$ then $p \in A$. Otherwise, $p$ is a limit
point of $A$ by Theorem 3.14, so $p \in A$ since $A$ is closed (Theorem 3.15).

Assume that for every $\{x_n\} \subset A$ so that $\{x_n\}$ converges to some point $p$, it is true that
$p \in A$. Let $p$ be a limit point of $A$. Then by Theorem 3.14, we know that there is a sequence
$\{x_n\} \subset A \setminus \{p\}$ which converges to $p$. Thus, $p \in A$. Hence, $A$ contains all of its limit points,
so $A$ is closed by Theorem 3.15.


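Theorem 3.17. If $U_\alpha$ is open for all $\alpha \in J$ then $\bigcup_{\alpha \in J} U_\alpha$ is open.


Proof. Let $p \in \bigcup_{\alpha \in J} U_\alpha$. Then $p \in U_\beta$ for some $\beta \in J$, so for some $\epsilon > 0$ it follows that
$(p - \epsilon, p + \epsilon) \subset U_\beta \subset \bigcup_{\alpha \in J} U_\alpha$, so $\bigcup_{\alpha \in J} U_\alpha$ is open.


Theorem 3.18. If $U_1, U_2, \ldots, U_n$ are open sets then $\bigcap_{i=1}^{n} U_i$ is open.


Proof. If $p \in \bigcap_{i=1}^{n} U_i$ then for each $i \leq n$ we can find $\epsilon_i > 0$ so that $(p - \epsilon_i, p + \epsilon_i) \subset U_i$.
Hence, if we set $\epsilon = \min(\epsilon_1, \epsilon_2, \ldots, \epsilon_n)$ then $(p - \epsilon, p + \epsilon) \subset \bigcap_{i=1}^{n} U_i$.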




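Theorem 3.19. If $A_\alpha$ is closed for all $\alpha \in J$ then $\bigcap_{\alpha \in J} A_\alpha$ is closed.


Proof. By DeMorgan's Laws, $\mathbb{R} \setminus \bigcap_{\alpha \in J} A_\alpha = \bigcup_{\alpha \in J} (\mathbb{R} \setminus A_\alpha)$, which is open by Theorem 3.17.
Hence, $\bigcap_{\alpha \in J} A_\alpha$ is closed.


Theorem 3.20. If $A_1, A_2, \ldots, A_n$ are closed sets then $\bigcup_{i=1}^{n} A_i$ is closed.


Proof. By DeMorgan's Laws, $\mathbb{R} \setminus \bigcup_{i=1}^{n} A_i = \bigcap_{i=1}^{n} (\mathbb{R} \setminus A_i)$, which is open by Theorem 3.18. Hence,
$\bigcup_{i=1}^{n} A_i$ is closed.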


It is not true, in general, that the union of closed sets is closed or the intersection of open
sets is open if the collections of sets are infinite. For instance, $\bigcap_{n=1}^{\infty} (-\frac{1}{n}, \frac{1}{n}) = \{0\}$, which is
not open, and $\bigcup_{x \in (0,1)} \{x\} = (0, 1)$, which is not closed.


Theorem 3.21. Let $S$ be a set which is bounded above and let $l = \sup(S)$. Then there is
a sequence of points $\{x_n\} \subset S$ which converges to $l$. If $S$ is bounded below and $l = \inf(S)$
then there is also a sequence of points $\{x_n\} \subset S$ which converges to $l$.

Proof. First, assume $l = \sup(S)$. By the Approximation Property, for each $n \in \mathbb{N}$ we can
choose $x_n \in S$ so that $l - \frac{1}{n} < x_n \leq l$. By the Squeeze Theorem, we know that $\{x_n\} \to l$.

Next, assume $l = \inf(S)$. By the Approximation Property, for each $n \in \mathbb{N}$ we can choose
$x_n \in S$ so that $l \leq x_n < l + \frac{1}{n}$. By the Squeeze Theorem, we know that $\{x_n\} \to l$.


Theorem 3.22. Let A be a closed set. If A is bounded below then A has a first point. If A 
is bounded above then A has a last point. 


Proof. If $A$ is bounded above with $l = \sup(A)$ or $A$ is bounded below with $l = \inf(A)$, then,
in either case, by Theorem 3.21, there is a sequence $\{x_n\} \subset A$ which converges to $l$. Hence,
by Theorem 3.16 we know that $l \in A$.


The following theorem is important, and we include three proofs for readers who might 
find one proof strategy to be more intuitive than the others. 


Theorem 3.23. The Bolzano-Weierstrass Theorem. Every bounded sequence has a convergent 
subsequence. 




Proof. Let $\{x_n\}$ be a bounded sequence. We first show that $\{x_n\}$ has a monotone subsequence.
Let $S = \{n \in \mathbb{N} \mid x_i \leq x_n \text{ for at most finitely many integers } i\}$. If $S$ is infinite then let $n_1$ be
the first element of $S$ and note that since $S$ is infinite and all but finitely many terms
$x_i$ exceed $x_{n_1}$ by definition of $S$, we can find $n_2 > n_1$ so that $n_2 \in S$ and $x_{n_2} > x_{n_1}$. Similarly,
if we have chosen $n_1 < n_2 < \ldots < n_k$ so that each $n_i \in S$ and $x_{n_1} < x_{n_2} < \ldots < x_{n_k}$ then
we can find $n_{k+1} \in S$ so that $n_{k+1} > n_k$ and $x_{n_{k+1}} > x_{n_k}$. So the subsequence $\{x_{n_k}\}$ is
increasing.

If $S$ is finite then let $n_1$ be the first integer exceeding all elements of $S$. Then by
definition, since $n_1$ is not an element of $S$, we know there are infinitely many integers $i$
so that $x_i \leq x_{n_1}$, so we can pick $n_2 > n_1$ so that $x_{n_2} \leq x_{n_1}$. Similarly, if we have chosen
$n_1 < n_2 < \ldots < n_k$ so that $x_{n_1} \geq x_{n_2} \geq \ldots \geq x_{n_k}$ then, since $n_k$ exceeds every element
of $S$ and so $n_k \notin S$, we can find $n_{k+1} > n_k$ so that
$x_{n_{k+1}} \leq x_{n_k}$, so the subsequence $\{x_{n_k}\}$ is non-increasing.

By Theorem 3.12, every monotone bounded sequence converges. Hence, $\{x_n\}$ has a
convergent subsequence $\{x_{n_k}\}$.


The following proof is slightly less brief, but it is easier to draw a picture of, which is 
often helpful. 


Proof. Since $\{x_n\}$ is bounded, we can find lower and upper bounds $a_1$ and $b_1$ for this
sequence so that $\{x_n\} \subset [a_1, b_1]$. Then if $x_n \in [a_1, \frac{a_1 + b_1}{2}]$ for infinitely many integers
$n$ then we set $[a_2, b_2] = [a_1, \frac{a_1 + b_1}{2}]$. Otherwise, since there are infinitely many natural
numbers, it follows that $x_n \in [\frac{a_1 + b_1}{2}, b_1]$ for infinitely many integers $n$, and we set $[a_2, b_2] =
[\frac{a_1 + b_1}{2}, b_1]$. If we have chosen nested intervals $[a_i, b_i]$ for all $i \leq k$ so that each $[a_i, b_i]$
contains $x_n$ for infinitely many integers $n$ and $b_i - a_i = \frac{b_1 - a_1}{2^{i-1}}$ and $a_1 \leq a_2 \leq \ldots \leq a_k <
b_k \leq b_{k-1} \leq \ldots \leq b_1$, then if $x_n \in [a_k, \frac{a_k + b_k}{2}]$ for infinitely many integers $n$ then we set
$[a_{k+1}, b_{k+1}] = [a_k, \frac{a_k + b_k}{2}]$. Otherwise, set $[a_{k+1}, b_{k+1}] = [\frac{a_k + b_k}{2}, b_k]$, and note that the
aforementioned properties hold for $i \leq k + 1$.

Choose $x_{n_1} \in [a_1, b_1]$. If we have chosen $x_{n_i} \in [a_i, b_i]$ for $i \leq k$ so that $n_1 < n_2 < \ldots < n_k$,
then since $x_n \in [a_{k+1}, b_{k+1}]$ for infinitely many integers $n$, we can choose $x_{n_{k+1}} \in [a_{k+1}, b_{k+1}]$
so that $n_{k+1} > n_k$. Then $\{x_{n_k}\}$ is a subsequence of $\{x_n\}$. Since $\{a_n\}$ is non-decreasing and
bounded above by every $b_i$, it follows that $\{a_n\}$ converges to $p = \sup(\{a_n\})$. We know
that $\{b_n - a_n\} \to 0$ and hence $\{b_n\} \to p$ by Exercise 3.8. Hence, by the Squeeze Theorem,
$\{x_{n_k}\} \to p$.
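
The interval-halving construction can be traced on a concrete sequence. The sketch below is an illustration under explicit assumptions, not a general algorithm: a computer cannot decide in general whether an interval contains infinitely many terms, so the test `has_infinitely_many` is derived by hand for the hypothetical example sequence $x_n = (-1)^n(1 + \frac{1}{n})$, which is not from the text. All function names are illustrative.

```python
from fractions import Fraction


def x(n: int) -> Fraction:
    """Illustrative bounded sequence x_n = (-1)^n (1 + 1/n)."""
    return (-1) ** n * (1 + Fraction(1, n))


def has_infinitely_many(lo: Fraction, hi: Fraction) -> bool:
    """Decide whether x_n lies in [lo, hi] for infinitely many n.

    Hand-derived for this particular sequence only: even-indexed terms
    decrease to 1 from above, odd-indexed terms increase to -1 from below.
    """
    evens = lo <= 1 < hi
    odds = lo < -1 <= hi
    return evens or odds


def bisect_subsequence(steps: int):
    """Return the nested intervals [a_i, b_i] and the chosen indices n_i."""
    a, b = Fraction(-2), Fraction(2)   # bounds: -2 <= x_n <= 2 for all n
    intervals = [(a, b)]
    indices = [1]                      # x_1 lies in [a_1, b_1]
    for _ in range(steps):
        mid = (a + b) / 2
        if has_infinitely_many(a, mid):
            b = mid                    # keep the left half
        else:
            a = mid                    # otherwise the right half must qualify
        intervals.append((a, b))
        n = indices[-1] + 1            # next index must exceed the previous one
        while not (a <= x(n) <= b):
            n += 1
        indices.append(n)
    return intervals, indices


intervals, indices = bisect_subsequence(10)
print(indices)
print(intervals[-1])
```

Because the left half is preferred whenever it qualifies, this run homes in on $-1$ rather than $1$; preferring the right half would produce a subsequence converging to the other limit.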


We give another proof which is somewhat more topological (more based on open and 
closed sets and limit points). 


Proof. Suppose $\{x_n\}$ has no limit point. Then every subset of $\{x_n\}$ has no limit point
by Exercise 3.14 and is therefore closed. By Theorem 3.22, we can find $n_1$ so that $x_{n_1}$
is the least element of $\{x_n\}$. Likewise, $\{x_i \mid i > n_1\}$ has a least element $x_{n_2} \geq x_{n_1}$ where
$n_2 > n_1$. Then choose $x_{n_3}$ to be the least element of $\{x_i \mid i > n_2\}$. Continuing in this
manner we construct a non-decreasing bounded subsequence $\{x_{n_k}\}$ of $\{x_n\}$ which converges
by Theorem 3.12.

If $\{x_n\}$ has a limit point $p$ then choose $n_1 \in \mathbb{N}$ so that $x_{n_1} \in (p - 1, p + 1)$. Assume we
have chosen $n_1 < n_2 < \ldots < n_k$ so that each $x_{n_i} \in (p - \frac{1}{i}, p + \frac{1}{i})$ for all positive integers
$i \leq k$. Since $p$ is a limit point of $\{x_n\}$ we know $(p - \frac{1}{k+1}, p + \frac{1}{k+1})$ contains $x_i$ for infinitely
many integers $i$ so we can choose $n_{k+1} > n_k$ so that $x_{n_{k+1}} \in (p - \frac{1}{k+1}, p + \frac{1}{k+1})$. The
subsequence $\{x_{n_k}\}$ thus chosen converges to $p$ by the Squeeze Theorem.


It may be worth noting that in the case where $\{x_n\}$ has no limit points in the last proof,
the subsequence $\{x_{n_k}\}$ constructed there is constant after some point (why would this be the case?)


Definition 22


We say that $\{x_n\}$ is a Cauchy sequence if for every $\epsilon > 0$ there is a $k \in \mathbb{N}$ so that
if $n, m \geq k$ then $|x_n - x_m| < \epsilon$.


Theorem 3.24. Let $\{x_n\}$ be a Cauchy sequence. Then $\{x_n\}$ is bounded.


Proof. Choose $k \in \mathbb{N}$ so that if $i, j \geq k$ then $|x_i - x_j| < 1$. Then, if we let $m =
\min\{x_1, x_2, \ldots, x_{k-1}, x_k - 1\}$ and let $M = \max\{x_1, x_2, \ldots, x_{k-1}, x_k + 1\}$, by definition of
minimum and maximum we see that if $i < k$ then $m \leq x_i \leq M$. Likewise, if $i \geq k$ then
since $|x_i - x_k| < 1$ it follows that $m < x_i < M$, so $\{x_n\}$ is bounded.


Theorem 3.25. Let $\{x_n\}$ be a sequence of real numbers. Then $\{x_n\}$ converges if and only
if $\{x_n\}$ is a Cauchy sequence.


Proof. First, assume $\{x_n\} \to p$ and let $\epsilon > 0$. Choose $k \in \mathbb{N}$ so that if $n \geq k$ then $|x_n - p| <
\frac{\epsilon}{2}$. Then if $n, m \geq k$ it follows that $|x_n - x_m| = |x_n - p + p - x_m| \leq |x_n - p| + |x_m - p| < \epsilon$.
Hence, $\{x_n\}$ is a Cauchy sequence.

Next, assume $\{x_n\}$ is a Cauchy sequence. Then by Theorem 3.24 we know $\{x_n\}$ is
bounded, so by the Bolzano-Weierstrass Theorem this sequence has a convergent subsequence
$\{x_{n_i}\} \to c$. Let $\epsilon > 0$. Choose $k_1$ so that if $i \geq k_1$ then $|x_{n_i} - c| < \frac{\epsilon}{2}$ and $k_2$ so that if $i, j \geq k_2$ then
$|x_i - x_j| < \frac{\epsilon}{2}$. By Exercise 3.12 we know $n_i \geq i$, so it then follows that if $i \geq \max\{k_1, k_2\}$ then $|x_i - c| \leq
|x_i - x_{n_i}| + |x_{n_i} - c| < \epsilon$. Thus, $\{x_n\} \to c$.
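
The Cauchy condition involves two indices at once, which a small check can make concrete. The sketch below assumes a hypothetical example, the partial sums $x_n = \frac{1}{2} + \frac{1}{4} + \ldots + \frac{1}{2^n} = 1 - \frac{1}{2^n}$ (not from the text), and uses the hand-derived bound $|x_n - x_m| < \frac{1}{2^k}$ for $n, m \geq k$. The function names are illustrative, and only a finite window of index pairs is tested.

```python
from fractions import Fraction


def x(n: int) -> Fraction:
    """Partial sum 1/2 + 1/4 + ... + 1/2^n, which equals 1 - 1/2^n."""
    return 1 - Fraction(1, 2 ** n)


def cauchy_index(eps: Fraction) -> int:
    """Smallest k with 1/2^k < eps; then |x_n - x_m| < 1/2^k < eps for n, m >= k."""
    k = 1
    while Fraction(1, 2 ** k) >= eps:
        k += 1
    return k


eps = Fraction(1, 1000)
k = cauchy_index(eps)
window = range(k, k + 30)
print(k, all(abs(x(n) - x(m)) < eps for n in window for m in window))
```

Note that the check never refers to the limit of the sequence, only to distances between its terms, which is the point of the Cauchy criterion.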


The main value of a Cauchy sequence is to look at sequences that in some sense should
converge to something. In subspaces of the real numbers which are not closed it is possible
to find Cauchy sequences that do not converge. The points to which they should converge
are missing from the space. A metric space where every Cauchy sequence converges is
called a complete metric space, but we will not be discussing metric spaces in this text.
Essentially, the completeness axiom of the real numbers causes every Cauchy sequence to
converge. In fact, an equivalent axiom to the completeness axiom would be to state that in
the real numbers every Cauchy sequence converges.


Definition 23


Let $\{a_n\}$ be a sequence. For any natural number $k$ we say that the subsequence
$\{a_{n+k}\}$ is a tail of the sequence $\{a_n\}$.


The following theorems can be helpful when using series convergence tests. They are 
not needed right away, so their proofs are left as exercises. 


Theorem 3.26. Let $\{a_n\}$ be a sequence and $k$ be a natural number. Then $\{a_n\} \to L$ if and
only if the sequence tail $\{a_{n+k}\} \to L$.


Theorem 3.27. Let $\{a_n\}$ be a sequence and $k$ be a natural number. Then $\{a_n\}$ is bounded
above if and only if the sequence tail $\{a_{n+k}\}$ is bounded above, and $\{a_n\}$ is bounded below
if and only if the sequence tail $\{a_{n+k}\}$ is bounded below.


Many ideas in calculus involve sequences that diverge in a specific way, which can 
be referred to as diverging to infinity or negative infinity. Alternately, we can say such 
sequences converge to infinity or negative infinity if we think of infinity and negative infinity 
as extended real numbers. More is discussed about extended real numbers and infinite limits 
for functions in the Supplementary Materials. We will just address a few theorems that 
come up in the study of series and integration here. While the term "converge to infinity" is
used, a sequence that converges to infinity is divergent (it does not converge), so this terminology
can be confusing.


Definition 24


We say that $\{x_n\} \to \infty$ (respectively $-\infty$), also written $\lim_{n \to \infty} x_n = \infty$ (respectively
$-\infty$), if, for every $M \in \mathbb{R}$ there is an integer $k \in \mathbb{N}$ so that if $n \geq k$ then $x_n > M$
(respectively $x_n < M$).


Theorem 3.28. If $\{x_n\} \to \infty$ and $\{y_n\}$ is bounded below then $\{x_n + y_n\} \to \infty$. If $\{x_n\} \to
-\infty$ and $\{y_n\}$ is bounded above then $\{x_n + y_n\} \to -\infty$.


Proof. Let $M > 0$. If $\{x_n\} \to \infty$ and $\{y_n\}$ is bounded below by $B$ then we can find $k \in \mathbb{N}$
so that if $n \geq k$ then $x_n > M - B$, so $x_n + y_n > M$, which means $\{x_n + y_n\} \to \infty$.

Let $M < 0$. If $\{x_n\} \to -\infty$ and $\{y_n\}$ is bounded above by $B$ then we can find $k \in \mathbb{N}$ so
that if $n \geq k$ then $x_n < M - B$, so $x_n + y_n < M$, which means $\{x_n + y_n\} \to -\infty$.


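Theorem 3.29. Let $\{x_n\}$ be bounded and let $\{y_n\} \to \pm\infty$. Then $\{\frac{x_n}{y_n}\} \to 0$.


Proof. Let $\epsilon > 0$. Choose $M$ so that $|x_n| < M$ for all $n \in \mathbb{N}$. Choose $k \in \mathbb{N}$ so that, for all
$n \geq k$, if $\{y_n\} \to \infty$ then $y_n > \frac{M}{\epsilon}$ and if $\{y_n\} \to -\infty$ then $y_n < \frac{-M}{\epsilon}$. If $n \geq k$ then
$|\frac{x_n}{y_n}| < \frac{M}{M/\epsilon} = \epsilon$, so $\{\frac{x_n}{y_n}\} \to 0$.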


Apart from showing that a finite set has a first and last point, it is not essential to cover 
the remaining material in this section to achieve a coherent understanding of advanced 
calculus (this material is not used much after this point), though it would be good to observe 
that the real numbers in an interval consisting of more than one point are uncountable, 
meaning that they cannot be listed as the range of a sequence. 

A section in the Supplementary Materials chapter proves what are probably the main 
theorems of interest about cardinality as the subject pertains to advanced calculus. Below, 
we just mention a few theorems that are more fundamental (and are thus included in the 
main body). 


Optional Content: Cardinality 


We often wish to talk about the number of elements in a set, and we would like to use 
ideas like the Pigeon Hole Principle in an argument or infinite variations of this idea, but if 
we wish to have access to these tools then we will need to formalize more. A more detailed 
and interesting development of this topic is found in the Supplementary Materials section. 

For those who wish to get a sense for the main results of interest in this discussion 
without moving further into the topic, we include arguments for why finite sets have first 
and last points, why the set of real numbers within any interval containing more than one 
point is uncountable, and also why the rational numbers are countable. 


Definition 25


We say $|A| = |B|$ or that sets $A, B$ have the same cardinality if there is a one
to one and onto mapping from $A$ to $B$. Let $\{1, 2, 3, \ldots, n\}$ denote $\{i \in \mathbb{N} \mid i \leq n\}$. If
$|A| = |\{1, 2, \ldots, n\}|$ for some natural number $n$, then we say $|A| = n$. If $|A| = n$ for
some non-negative integer $n$ then we say that $A$ is finite. If $A$ is non-empty and not
finite then we say that $A$ is infinite. If $|A| = |\mathbb{N}|$ then we say that $A$ is countably
infinite. If $A$ is countably infinite or finite then we say that $A$ is countable. If $A$ is
not countable then we say that $A$ is uncountable.




Theorem 3.30. Let $S$ be a finite non-empty subset of $\mathbb{R}$. Then $S$ has a first and last point.


Proof. We proceed by induction on the cardinality $n$ of $S$. If $S$ has one point then this is
its first and last point. Assume that every set of cardinality $k$ has a first and last point for
some $k \in \mathbb{N}$. Then let $|S| = k + 1$ and let $f : \{1, 2, \ldots, k + 1\} \to S$ be one to one and onto.
Let $g : \{1, 2, 3, \ldots, k\} \to S \setminus \{f(k + 1)\}$ be defined by $g(i) = f(i)$. Then $g$ is one to one
and onto since $f$ is one to one and onto (because different points map to different images
under $f$ and hence under $g$, and every point in $S \setminus \{f(k + 1)\}$ is mapped to by a point of
$\{1, 2, 3, \ldots, k\}$ under $f$ and therefore under $g$), which means $|S \setminus \{f(k + 1)\}| = k$. Then the
range of $g$ has a first point $m$ and a last point $M$ by the induction hypothesis. Thus, for
all $1 \leq i \leq k + 1$ it is true that $\min\{m, f(k + 1)\} \leq f(i) \leq \max\{M, f(k + 1)\}$, so $S$ has a
first point and a last point. By induction, it follows that every finite set has a first point
and a last point.
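
The induction in this proof translates directly into a recursion on a finite list. The following is a minimal sketch, assuming the finite set is given as a non-empty Python list of distinct numbers; the function name `first_and_last` is hypothetical. It mirrors the proof (set aside one element, apply the hypothesis to the rest, then compare) rather than aiming for efficiency, and very long lists would exceed Python's recursion limit.

```python
def first_and_last(points):
    """Return (first, last) point of a non-empty finite list of real numbers."""
    if not points:
        raise ValueError("the set must be non-empty")
    if len(points) == 1:
        # Base case: a one-point set is its own first and last point.
        return points[0], points[0]
    # Inductive step: 'extra' plays the role of f(k+1), 'rest' of S \ {f(k+1)}.
    *rest, extra = points
    m, M = first_and_last(rest)
    first = m if m <= extra else extra   # min{m, f(k+1)}
    last = M if M >= extra else extra    # max{M, f(k+1)}
    return first, last


print(first_and_last([3, -1, 4, 1.5, 9, -2.6]))
```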


While there are interesting proofs that the real numbers are uncountable using decimals 
which are more intuitive than the one below (such a proof is developed in the Supplementary 
Section), we can still prove that the real numbers are uncountable using only the theorems 
we have developed thus far. The disadvantage to using the decimal argument is that to 
make the argument rigorous we must prove theorems about decimal representations of real 
numbers (which are also in the Supplementary Materials). 


Theorem 3.31. Let $S$ be a subset of $\mathbb{R}$ containing the interval $(a, b)$. Then $S$ is uncountable.
In particular, $\mathbb{R}$ is uncountable.


Proof. Suppose $S$ is countable. We know $S$ is non-empty since $a < b$. If $S$ is finite there
is a one to one and onto function $g : \{1, 2, 3, \ldots, n\} \to S$ for some $n \in \mathbb{N}$, in which case we
can extend $g$ to a function $f : \mathbb{N} \to S$ which is onto by setting $f(i) = g(1)$ if $i > n$ and
$f(i) = g(i)$ if $i \leq n$. Thus, whether $S$ is finite or countably infinite we can find a function
$f : \mathbb{N} \to S$ which is onto. Setting $f(i) = x_i$ for each $i \in \mathbb{N}$ we can write $S = \{x_1, x_2, x_3, \ldots\}$.

Choose $n_1$ so that $x_{n_1} \in (a, b)$. Let $n_2$ be the first positive integer so that $x_{n_1} < x_{n_2} < b$.
Let $n_3$ be the first positive integer so that $x_{n_3} \in (x_{n_1}, x_{n_2})$. In general, if $n_i$ has been chosen
for $1 \leq i \leq k$ then let $n_{k+1}$ be the first positive integer so that $x_{n_{k+1}}$ is between $x_{n_{k-1}}$ and
$x_{n_k}$. Note that $x_{n_1} < x_{n_3} < x_{n_5} < \ldots$ and $x_{n_2} > x_{n_4} > \ldots$ and that if $j$ is even and $k$ is
odd then $x_{n_j} > x_{n_k}$. Then by Theorem 3.12, we know that $\{x_{n_{2i+1}}\} \to u = \sup\{x_{n_{2i+1}}\}$,
where $u < x_{n_j}$ if $j$ is even and $u > x_{n_j}$ if $j$ is odd by the Comparison Theorem, which
means $u \notin \{x_{n_i}\}$. Since $u \in (a, b)$ and $f$ is onto, there is an integer $s$ so that $u = x_s$. But
then if $u \neq x_{n_j}$ for any $j < s$, by construction we would have chosen $x_{n_s} = u$, which is a
contradiction to the fact that $u \notin \{x_{n_i}\}$. Hence, $S$ is uncountable.


We next wish to establish that Q is countable. Our first argument for the countability 
of Q only works if the reader is willing to accept the existence of the function based on the 
diagram below, and is not rigorous. We are not referring to this as a proof because of the 
missing details, but it is still instructive. 


Explanation for why the set Q of rational numbers is countably infinite: 




By setting f(n) to be the nth rational number encountered by the following arrow path 
which has not yet been mapped to by an earlier natural number, we get a one to one and 
onto function from N to Q, meaning that Q is countably infinite. 


[Figure: array of the fractions p/q, with rows q = 1, 2, 3, ... and columns p = ..., -3, -2, -1, 0, 1, 2, 3, ..., traversed by an arrow path.]


To make the preceding argument rigorous is certainly possible and could be achieved by 
describing in words exactly how images that come from the arrow diagram are chosen 
(with a properly defined algorithm or formula instead of a picture) and then proving 
that the algorithm for such choices creates a one to one and onto function. Since this 
description seems to be a bit awkward, we use the procedure in the following argument to 
formalize this theorem instead. Another (nicer) development of this result is included in the 
Supplementary Materials section, but the argument below requires no further development. 


Theorem 3.32. The set Q of rational numbers is countable. 
Proof. We define $f : \mathbb{N} \to \mathbb{Q}$ inductively as follows. Let $f(1) = 0$, $f(2) = 1$ and $f(3) = -1$.

If $f(i)$ has been defined for all $i \in \mathbb{N}$ with $1 \le i \le k$ for some $k \ge 3$ then we let s be the
least natural number so that there is a rational number $\frac{p}{q} \ne 0$ having the property that
$|p| + |q| = s$ and $f(i) \ne \frac{p}{q}$ for any $i \le k$. Pick any $p, q \in \mathbb{Z}$ with $q \ne 0$ so that $f(i) \ne \frac{p}{q}$ for any $i \le k$
and $|p| + |q| = s$ and assign $f(k+1) = \frac{p}{q}$.


Since each choice of $f(k+1)$ is a rational number which is not $f(i)$ for any $i \le k$, it
follows that f is one to one. Note that for any integer $s \ge 2$, there are only $2s - 2$ distinct
pairs of non-zero numbers $a \in \mathbb{Z}$, $b \in \mathbb{N}$ having the property that $|a| + |b| = s$ (specifically,
$(\pm 1, s-1), (\pm 2, s-2), \dots, (\pm(s-1), 1)$). Hence, all rational numbers $\frac{a}{b}$ having the property
that $|a| + |b| = s$ have been mapped to by positive integers i so that $i \le 1 + s(s-1)$ by Theorem
2.2, because all such rational numbers have been chosen as images of integers within the
first $1 + 2 + 4 + 6 + \dots + (2s-2)$ natural numbers. Since all rational numbers are in the range
of f, it follows that Q is countably infinite.
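
The inductive definition can be sketched as a generator. This is only an illustration under stated assumptions: the proof allows any choice of an unused fraction with $|p| + |q| = s$, whereas the sketch fixes one particular order, and the name `enumerate_rationals` is hypothetical.

```python
from fractions import Fraction
from itertools import count, islice

def enumerate_rationals():
    """Yield f(1), f(2), ...: first 0, then unused fractions p/q grouped by s = |p| + q."""
    seen = {Fraction(0)}
    yield Fraction(0)
    for s in count(2):                      # s = |p| + q, taken in increasing order
        for q in range(s - 1, 0, -1):       # denominator q is a natural number
            p = s - q                       # so |p| = s - q is at least 1
            for signed_p in (p, -p):
                value = Fraction(signed_p, q)
                if value not in seen:       # skip values already mapped to (keeps f one to one)
                    seen.add(value)
                    yield value

first_terms = list(islice(enumerate_rationals(), 10))
print(first_terms)
```

Because at most $2s - 2$ new values are produced for each s, a fraction with $|a| + |b| = s$ appears no later than position $1 + s(s-1)$, matching the count in the proof.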




Exercises: 


Exercise 3.1. Let $\{x_n\}$ be a sequence. Then $\{x_n\} \to c$ if and only if $\{|x_n - c|\} \to 0$.

Exercise 3.2. An open interval (a,b) is an open set.

Exercise 3.3. A closed interval [a,b] is a closed set.

Exercise 3.4. Let $S \subset \mathbb{R}$ and let $p \in \mathbb{R}$. Then p is a limit point of S if and only if every
open set containing p contains a point of S distinct from p.

Note that the condition in the preceding exercise is usually used as the definition of a
point p being a limit point of a set S in general.

Exercise 3.5. Let $l = \sup(S)$. If $l \notin S$ then there is an increasing sequence $\{x_n\} \subset S$ so
that $\{x_n\} \to l$.
Let $b = \inf(T)$. If $b \notin T$ then there is a decreasing sequence $\{x_n\} \subset T$ so that $\{x_n\} \to b$.

Exercise 3.6. Let $\{x_n\}$ be a sequence of real numbers. Then $\{x_n\} \to p$ if and only if every
open interval containing p excludes $x_n$ for at most finitely many positive integers n. This
is true if and only if for every $\epsilon > 0$ there are at most finitely many positive integers n so
that $x_n \notin (p - \epsilon, p + \epsilon)$.

Exercise 3.7. Any non-empty interval whose right endpoint is b has a supremum equal to b. Any
non-empty interval whose left endpoint is a has an infimum equal to a.

Exercise 3.8. If $\{a_n\} \to a$ and $\{a_n + b_n\} \to a + b$, then $\{b_n\} \to b$.

Exercise 3.9. If $\{a_n\} \to a \ne 0$ and $\{a_n b_n\} \to ab$, then $\{b_n\} \to b$.

Exercise 3.10. Find an example of sequences $\{x_n\}$ and $\{y_n\}$ which both diverge, such that
$\{x_n y_n\}$ converges.

Exercise 3.11. Let r be a real number. There is a sequence of rational numbers converging
to r.

Exercise 3.12. Let $\{x_{n_i}\}$ be a subsequence of $\{x_n\}$. Then $n_i \ge i$ for each $i \in \mathbb{N}$.

Exercise 3.13. If $|r| < 1$ then $\{r^n\} \to 0$.

Exercise 3.14. Let p be a limit point of a set A and let $A \subset B$. Then p is a limit point of
B.

Exercise 3.15. Let p be a limit point of $A \cup B$. Then either p is a limit point of A or p is
a limit point of B.

Exercise 3.16. Let E be a set and let $A \subset E$. Then $E \setminus A$ is open in E if and only if A
is closed in E.

Exercise 3.17. Let $A_i$ be closed, non-empty and bounded for each natural number i, so
that $A_1 \supset A_2 \supset A_3 \supset \dots$. Then $\bigcap_{i=1}^{\infty} A_i$ is non-empty.

Exercise 3.18. Let S be a bounded infinite set. Then S has a limit point.

Exercise 3.19. The sets $\mathbb{Z}, \mathbb{R}, \mathbb{Q}$ are infinite sets.

Exercise 3.20. Let $\{a_n\}$ be a sequence and k be a natural number. Then $\{a_n\} \to L$ if and
only if the sequence tail $\{a_{n+k}\} \to L$.

Exercise 3.21. Let $\{a_n\}$ be a sequence and k be a natural number. Then $\{a_n\}$ is bounded
above if and only if the sequence tail $\{a_{n+k}\}$ is bounded above, and $\{a_n\}$ is bounded below
if and only if the sequence tail $\{a_{n+k}\}$ is bounded below.




Hints: 


Hint to Exercise 3.1. Let $\{x_n\}$ be a sequence. Then $\{x_n\} \to c$ if and only if $\{|x_n - c|\} \to 0$.

Use Theorem 3.3.

Hint to Exercise 3.2. An open interval (a,b) is an open set.

Take an arbitrary point p in (a,b) and find an $\epsilon$ small enough so that $(p - \epsilon, p + \epsilon) \subset (a,b)$.

Hint to Exercise 3.3. A closed interval [a,b] is a closed set.

Show that open rays are open.

Hint to Exercise 3.4. Let $S \subset \mathbb{R}$ and let $p \in \mathbb{R}$. Then p is a limit point of S if and only
if every open set containing p contains a point of S distinct from p.

Use the result of Exercise 3.2.

Hint to Exercise 3.5. Let $l = \sup(S)$. If $l \notin S$ then there is an increasing sequence
$\{x_n\} \subset S$ so that $\{x_n\} \to l$.
Let $b = \inf(T)$. If $b \notin T$ then there is a decreasing sequence $\{x_n\} \subset T$ so that $\{x_n\} \to b$.

Parallel the proof of Theorem 3.21.

Hint to Exercise 3.6. Let $\{x_n\}$ be a sequence of real numbers. Then $\{x_n\} \to p$ if and only
if every open interval containing p excludes $x_n$ for at most finitely many positive integers n.
This is true if and only if for every $\epsilon > 0$ there are at most finitely many positive integers
n so that $x_n \notin (p - \epsilon, p + \epsilon)$.

Write the definition of convergence, and note that the first k sequence terms form a finite
set. For the other direction, remember that if there is a finite number of sequence terms
excluded by $(p - \epsilon, p + \epsilon)$ then the corresponding indices are a finite set, meaning there is
a last index of an excluded point.

Hint to Exercise 3.7. Any non-empty interval whose right endpoint is b has a supremum equal to b.
Any non-empty interval whose left endpoint is a has an infimum equal to a.

Use the definition of interval, supremum and the fact that there is a real number between
any two numbers (specifically, a rational number has been shown to be between any two
numbers).

Hint to Exercise 3.8. If $\{a_n\} \to a$ and $\{a_n + b_n\} \to a + b$, then $\{b_n\} \to b$.

Use the sum rule for sequence limits. Remember that you cannot assume that $\{b_n\}$
converges.


Hint to Exercise 3.9. If $\{a_n\} \to a \ne 0$ and $\{a_n b_n\} \to ab$, then $\{b_n\} \to b$.

Use the product rule for sequence limits. Remember that you cannot assume that $\{b_n\}$
converges.

Hint to Exercise 3.10. Find an example of sequences $\{x_n\}$ and $\{y_n\}$ which both diverge,
such that $\{x_n y_n\}$ converges.

There are examples where $\{x_n\}$ diverges and $\{(x_n)^2\}$ converges. It might be easier to
think of such a sequence. You want one sequence to move points in the other close to the
remaining points of that sequence when the sequences are multiplied together.

Hint to Exercise 3.11. Let r be a real number. There is a sequence of rational numbers
converging to r.

Start with a sequence converging to r and use the fact that there is a rational number
between any two points and apply the Squeeze Theorem.

Hint to Exercise 3.12. Let $\{x_{n_i}\}$ be a subsequence of $\{x_n\}$. Then $n_i \ge i$ for each $i \in \mathbb{N}$.

Use induction.

Hint to Exercise 3.13. If $|r| < 1$ then $\{r^n\} \to 0$.

First, explain why $\{|r|^n\}$ converges (what kind of sequence is it?). Then consider the
subsequence $\{|r|^{n+1}\}$ of $\{|r|^n\}$. What does this subsequence converge to according to
the product rule for limits? What does it converge to according to the theorem that a
subsequence converges to the same point as the sequence it is a subsequence of?

Hint to Exercise 3.14. Let p be a limit point of a set A and let $A \subset B$. Then p is a limit
point of B.

Use the definition of limit point and subset.

Hint to Exercise 3.15. Let p be a limit point of $A \cup B$. Then either p is a limit point of
A or p is a limit point of B.

Assume p is not a limit point of A, and explain why it must be a limit point of B.

Hint to Exercise 3.16. Let E be a set and let $A \subset E$. Then $E \setminus A$ is open in E if and
only if A is closed in E.

Use the definitions of open and closed in E. If there is an open set U so that $U \cap E = A$
then what does this say about $(\mathbb{R} \setminus U) \cap E$?

Hint to Exercise 3.17. Let $A_i$ be closed, non-empty and bounded for each natural number
i, so that $A_1 \supset A_2 \supset A_3 \supset \dots$. Then $\bigcap_{i=1}^{\infty} A_i$ is non-empty.

Take a point in each $A_i$. The points chosen form a bounded sequence. What is known
about bounded sequences? If a set is closed, what is known about the limit of a sequence
of points from that set?

Hint to Exercise 3.18. Let S be a bounded infinite set. Then S has a limit point.

Use the Bolzano-Weierstrass Theorem.

Hint to Exercise 3.19. The sets $\mathbb{Z}, \mathbb{R}, \mathbb{Q}$ are infinite sets.

Do any of these sets have last points?

Hint to Exercise 3.20. Let $\{a_n\}$ be a sequence and k be a natural number. Then $\{a_n\} \to L$
if and only if the sequence tail $\{a_{n+k}\} \to L$.

Try looking at what happens in the sequence k indices later and compare that to
corresponding sequence points in the tail of the sequence.

Hint to Exercise 3.21. Let $\{a_n\}$ be a sequence and k be a natural number. Then $\{a_n\}$ is
bounded above if and only if the sequence tail $\{a_{n+k}\}$ is bounded above, and $\{a_n\}$ is bounded
below if and only if the sequence tail $\{a_{n+k}\}$ is bounded below.

Consider that the difference between a sequence and its tail is only finitely many points,
and finite sets have first and last points.




Solutions: 


Solution to Exercise 3.1. Let $\{x_n\}$ be a sequence. Then $\{x_n\} \to c$ if and only if
$\{|x_n - c|\} \to 0$.

Proof. We know $\{x_n\} \to c$ if and only if $\{x_n - c\} \to 0$ by Theorem 3.3, which is true if
and only if, for every $\epsilon > 0$ there is an integer k so that if $n \ge k$ then $|x_n - c| < \epsilon$. Since
$|x_n - c| = ||x_n - c| - 0|$, this is true if and only if $\{|x_n - c|\} \to 0$.

Solution to Exercise 3.2. An open interval (a,b) is an open set.

Proof. Let $p \in (a,b)$ and let $\epsilon = \min\{p - a, b - p\}$. Then $p - \epsilon \ge p - (p - a) = a$ and
$p + \epsilon \le p + (b - p) = b$. Thus, $(p - \epsilon, p + \epsilon) \subset (a,b)$. Hence, (a,b) is open.

Solution to Exercise 3.3. A closed interval [a,b] is a closed set.

Proof. By Exercise 3.2, we know that $(b, b+n)$ and $(a-n, a)$ are open for each natural
number n. Hence, $U = \bigcup_{n=1}^{\infty} \big((a-n, a) \cup (b, b+n)\big)$ is open by Theorem 3.17. If $M > b$ then
since $\mathbb{N}$ is unbounded, there is some $n \in \mathbb{N}$ so that $n > M - b$, so $b + n > M$, which means
$M \in U$. By a similar argument, if $M < a$ then $M \in U$. Since none of these intervals contains a
point of [a,b], it follows that $U = \mathbb{R} \setminus [a,b]$, which means that [a,b] is closed.

Solution to Exercise 3.4. Let $S \subset \mathbb{R}$ and let $p \in \mathbb{R}$. Then p is a limit point of S if and
only if every open set containing p contains a point of S distinct from p.

Proof. Assume p is a limit point of S. Let U be an open set containing p. Then for
some $\epsilon > 0$ we know $(p - \epsilon, p + \epsilon) \subset U$, which means that there is some $q \ne p$ so that
$q \in (p - \epsilon, p + \epsilon) \cap S \subset U \cap S$ (since p is a limit point of S).

Assume that every open set containing p contains a point of S distinct from p. Let $\epsilon > 0$.
Since the interval $(p - \epsilon, p + \epsilon)$ is an open set by Exercise 3.2, it follows that $(p - \epsilon, p + \epsilon)$
contains a point of S distinct from p, which implies that p is a limit point of S.


Solution to Exercise 3.5. Let $l = \sup(S)$. If $l \notin S$ then there is an increasing sequence
$\{x_n\} \subset S$ so that $\{x_n\} \to l$.
Let $b = \inf(T)$. If $b \notin T$ then there is a decreasing sequence $\{x_n\} \subset T$ so that $\{x_n\} \to b$.

Proof. Let $l = \sup(S)$, where $l \notin S$. By the Approximation Property we can choose
$x_1 \in S \cap (l - 1, l]$. Since $l \notin S$, $l - 1 < x_1 < l$. Likewise, we can choose $x_2 \in S \cap (\max(x_1, l - \frac{1}{2}), l)$.
Continuing, we choose each $x_n \in S$ so that $\max(l - \frac{1}{n}, x_{n-1}) < x_n < l$ for all $n > 1$. Then $\{x_n\}$
is increasing and converges to l by the Squeeze Theorem.

Let $b = \inf(T)$, where $b \notin T$. By the Approximation Property we can choose
$x_1 \in T \cap [b, b + 1)$. Since $b \notin T$, $b + 1 > x_1 > b$. Likewise, we can choose $x_2 \in T \cap (b, \min\{x_1, b + \frac{1}{2}\})$.
Continuing, we choose each $x_n \in T$ so that $\min\{b + \frac{1}{n}, x_{n-1}\} > x_n > b$ for all $n > 1$. Then $\{x_n\}$
is decreasing and converges to b by the Squeeze Theorem.


Solution to Exercise 3.6. Let $\{x_n\}$ be a sequence of real numbers. Then $\{x_n\} \to p$ if
and only if every open interval containing p excludes $x_n$ for at most finitely many positive
integers n. This is true if and only if for every $\epsilon > 0$ there are at most finitely many positive
integers n so that $x_n \notin (p - \epsilon, p + \epsilon)$.

Proof. First, note that if $p \in (a,b)$ then by setting $\epsilon = \min\{|p - a|, |p - b|\}$, it follows that
$(p - \epsilon, p + \epsilon) \subset (a,b)$. Thus, if $x_n \notin (a,b)$ for infinitely many integers n then $x_n \notin (p - \epsilon, p + \epsilon)$
for infinitely many integers n. Likewise, if there is some $\epsilon > 0$ so that $x_n \notin (p - \epsilon, p + \epsilon)$ for
infinitely many integers n, then there is an open interval (namely $(p - \epsilon, p + \epsilon)$) that excludes
$x_n$ for infinitely many integers n. Hence the two finiteness conditions are equivalent.

Assume $\{x_n\} \to p$. Let $\epsilon > 0$. Choose $k \in \mathbb{N}$ so that if $n \ge k$ then $|x_n - p| < \epsilon$. Then if $n \ge k$ we
know $x_n \in (p - \epsilon, p + \epsilon)$. Thus, the set S of integers i so that $x_i \notin (p - \epsilon, p + \epsilon)$ is a subset
of $\{1, 2, \dots, k - 1\}$, which is finite, and thus S is finite.

Assume that, for every $\epsilon > 0$ there are only finitely many integers n so that
$x_n \notin (p - \epsilon, p + \epsilon)$. Let $\epsilon > 0$. Then there is a last integer $k - 1$ so that $x_{k-1} \notin (p - \epsilon, p + \epsilon)$
(take $k = 1$ if no term is excluded). Hence, if $n \ge k$
then $x_n \in (p - \epsilon, p + \epsilon)$, so $|x_n - p| < \epsilon$, which means that $\{x_n\} \to p$.

Solution to Exercise 3.7. Any non-empty interval whose right endpoint is b has a
supremum equal to b. Any non-empty interval whose left endpoint is a has an infimum equal to a.

Proof. If the right end point of an interval I is b then for some number $a < b$, $a \in I$ so all
numbers between a and b are in I. Thus, if $c < b$ then there is a rational number q between
$\max(a, c)$ and b, so $q \in I$ and c is not an upper bound for I. Since I contains no points
greater than its right end point, b is the least upper bound for I.

If a is the left end point of an interval I then as before, we can find a number in I less
than any number exceeding a, and it follows that a is the greatest lower bound of I.


Solution to Exercise 3.8. If $\{a_n\} \to a$ and $\{a_n + b_n\} \to a + b$, then $\{b_n\} \to b$.

Proof. By the sum rule (and product rule) we know that $\{(a_n + b_n) - a_n\} \to a + b - a$,
which means that $\{b_n\} \to b$.

Solution to Exercise 3.9. If $\{a_n\} \to a \ne 0$ and $\{a_n b_n\} \to ab$, then $\{b_n\} \to b$.

Proof. By product and quotient rules we know that $\{\frac{1}{a_n}\} \to \frac{1}{a}$ (where $a_n \ne 0$ for all
sufficiently large n since $a \ne 0$), which means that
$\{b_n\} = \{\frac{1}{a_n}(a_n b_n)\} \to (ab)(\frac{1}{a}) = b$.

Solution to Exercise 3.10. Find an example of sequences $\{x_n\}$ and $\{y_n\}$ which both
diverge, such that $\{x_n y_n\}$ converges.

Proof. We will use $x_n = y_n = (-1)^n$. Then $x_n y_n = 1$ for each $n \in \mathbb{N}$ so $\{x_n y_n\}$ converges.
On the other hand $\{(-1)^n\}$ diverges since given any $k \in \mathbb{N}$ we know that $|(-1)^k - (-1)^{k+1}| = 2$,
so $\{(-1)^n\}$ is not a Cauchy sequence and therefore cannot converge.

Solution to Exercise 3.11. Let r be a real number. There is a sequence of rational
numbers converging to r.

Proof. For each $n \in \mathbb{N}$ we can choose a rational number $q_n \in (r - \frac{1}{n}, r + \frac{1}{n})$ since we have
shown there is a rational number between any two real numbers. By Theorem 3.1 and the
sum rule for sequence limits we know that $\{r - \frac{1}{n}\} \to r$ and $\{r + \frac{1}{n}\} \to r$, so by the Squeeze
Theorem it follows that $\{q_n\} \to r$.


Solution to Exercise 3.12. Let $\{x_{n_i}\}$ be a subsequence of $\{x_n\}$. Then $n_i \ge i$ for each
$i \in \mathbb{N}$.

Proof. We know that $\{n_i\}$ is an increasing sequence of natural numbers by the definition
of subsequence. Proceed by induction. First, $n_1 \in \mathbb{N}$ so $n_1 \ge 1$. Assume that $n_k \ge k$ for
some natural number k. Then since $n_{k+1} > n_k \ge k$ and $k + 1$ is the first natural number
which exceeds k, it must follow that $n_{k+1} \ge k + 1$. The result follows by induction.

Solution to Exercise 3.13. If $|r| < 1$ then $\{r^n\} \to 0$.

Proof. First, since $|r| < 1$ we know that $|r^{n+1}| = |r||r^n| \le |r^n|$, so $\{|r^n|\}$ is a decreasing
sequence which is bounded below by 0. Thus, from the Monotone Convergence Theorem we
know that $\{|r^n|\}$ converges to its greatest lower bound L, and that $L \ge 0$ by the Comparison
Theorem. We note that $\{|r^{n+1}|\}$ is a subsequence of $\{|r^n|\}$, and so by Theorem 3.11, we
know that $\{|r^{n+1}|\} \to L$. However, $\{|r^{n+1}|\} = \{|r||r^n|\}$, so by the product rule for sequence
limits, $\{|r^{n+1}|\} \to |r|L$. It follows that $L = |r|L$, so $(1 - |r|)L = 0$. Since $|r| < 1$ we know
$1 - |r| \ne 0$, so $L = 0$ and $\{|r^n|\} \to 0$. From Theorem 1.16 we know
that $-|r^n| \le r^n \le |r^n|$, so by the Squeeze Theorem it follows that $\{r^n\} \to 0$.
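
As a numerical illustration only (it proves nothing), one can look for the first index at which $|r|^n$ drops below a given tolerance. The function name `first_index_below` and the sample values are assumptions chosen for this sketch.

```python
def first_index_below(r, eps):
    """Return the least n >= 1 with |r|**n < eps, assuming |r| < 1 and eps > 0."""
    if not (abs(r) < 1 and eps > 0):
        raise ValueError("need |r| < 1 and eps > 0")
    n, power = 1, abs(r)
    while power >= eps:      # {|r|^n} is decreasing, so this loop terminates
        n += 1
        power *= abs(r)
    return n

n = first_index_below(-0.9, 1e-3)
print(n, abs((-0.9) ** n))
```

Past the returned index every later term is also below the tolerance, since the sequence $\{|r|^n\}$ is decreasing.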




Solution to Exercise 3.14. Let p be a limit point of a set A and let $A \subset B$. Then p is a
limit point of B.

Proof. Let $\epsilon > 0$. Then there is a point $q \in (p - \epsilon, p + \epsilon) \cap A \setminus \{p\}$. Since $A \subset B$ we know
that $q \in B$. Thus, p is a limit point of B.

Solution to Exercise 3.15. Let p be a limit point of $A \cup B$. Then either p is a limit point
of A or p is a limit point of B.

Proof. Assume that p is not a limit point of A. Then there is an $\epsilon_1 > 0$ so that $(p - \epsilon_1, p + \epsilon_1)$
contains no points of A other than p. This means that for all $\epsilon \le \epsilon_1$ it is true that $(p - \epsilon, p + \epsilon)$
contains no point of A distinct from p. Since $(p - \epsilon, p + \epsilon)$ contains a point of $A \cup B$ distinct
from p, it must follow that $(p - \epsilon, p + \epsilon)$ contains a point of B distinct from p. Thus, for
every $\gamma > 0$ there is an $\epsilon \le \min\{\epsilon_1, \gamma\}$ and there is a point $q \in (p - \epsilon, p + \epsilon) \cap B \setminus \{p\} \subset
(p - \gamma, p + \gamma) \cap B \setminus \{p\}$, so p is a limit point of B. Hence, either p is a limit point of A or p
is a limit point of B.

Solution to Exercise 3.16. Let E be a set and let $A \subset E$. Then $E \setminus A$ is open in E if
and only if A is closed in E.

Proof. Let $E \setminus A$ be open in E. Then there is an open set V so that $V \cap E = E \setminus A$. Since
$\mathbb{R} \setminus V$ is closed, it follows that $(\mathbb{R} \setminus V) \cap E = A$ is closed in E.

Let A be closed in E. Then there is a closed set K so that $K \cap E = A$. Since $\mathbb{R} \setminus K$ is
open, it follows that $(\mathbb{R} \setminus K) \cap E = E \setminus A$ is open in E.


Solution to Exercise 3.17. Let $A_i$ be closed, non-empty and bounded for each natural
number i, so that $A_1 \supset A_2 \supset A_3 \supset \dots$. Then $\bigcap_{i=1}^{\infty} A_i$ is non-empty.

Proof. Choose $x_n \in A_n$ for each $n \in \mathbb{N}$. Then $\{x_n\} \subset A_1$, which is bounded, so $\{x_n\}$ is
bounded. Hence, $\{x_n\}$ has a convergent subsequence $\{x_{n_i}\} \to p$. This means, for each
$k \in \mathbb{N}$, the subsequence $\{x_{n_{i+k}}\} \subset A_k$ by Exercise 3.12 (since $n_{i+k} \ge i + k \ge k$), so
$\{x_{n_{i+k}}\} \to p$ by Theorem 3.11.
Since each $A_k$ is closed we know that $p \in A_k$ for all $k \in \mathbb{N}$ by Theorem 3.16, and thus
$p \in \bigcap_{i=1}^{\infty} A_i$.

Solution to Exercise 3.18. Let S be a bounded infinite set. Then S has a limit point.

Proof. Let S be bounded and infinite. Choose $x_1 \in S$. If we have chosen $x_i$ for $i \le k$ then choose
$x_{k+1} \in S \setminus \{x_1, x_2, x_3, \dots, x_k\}$. Such a choice is always possible since S is infinite. Thus
$\{x_n\}$ is a bounded sequence of points of S, and $x_i \ne x_j$ for all $i \ne j$. By the Bolzano-Weierstrass
Theorem, $\{x_n\}$ has a convergent subsequence $\{x_{n_i}\} \to p$. Thus, for every $\epsilon > 0$, the interval
$(p - \epsilon, p + \epsilon)$ contains infinitely many elements of $\{x_n\}$ by Exercise 3.6 (each of which is an
element of S), so p is a limit point of S.




Solution to Exercise 3.19. The sets $\mathbb{Z}, \mathbb{R}, \mathbb{Q}$ are infinite sets.

Proof. None of these sets have last points (or first points) so each of these sets is infinite
by Theorem 3.30.

Solution to Exercise 3.20. Let $\{a_n\}$ be a sequence and k be a natural number. Then
$\{a_n\} \to L$ if and only if the sequence tail $\{a_{n+k}\} \to L$.

Proof. If $\{a_n\} \to L$ then since $\{a_{n+k}\}$ is a subsequence of $\{a_n\}$ it follows that $\{a_{n+k}\} \to L$
by Theorem 3.11. Next, assume $\{a_{n+k}\} \to L$. Let $\epsilon > 0$. We can choose $N \in \mathbb{N}$ so that if
$n \ge N$ then $|a_{n+k} - L| < \epsilon$. Thus, if $n \ge N + k$ it follows that $|a_n - L| = |a_{(n-k)+k} - L| < \epsilon$.
Hence, $\{a_n\} \to L$.

Solution to Exercise 3.21. Let $\{a_n\}$ be a sequence and k be a natural number. Then
$\{a_n\}$ is bounded above if and only if the sequence tail $\{a_{n+k}\}$ is bounded above, and $\{a_n\}$ is
bounded below if and only if the sequence tail $\{a_{n+k}\}$ is bounded below.

Proof. If there is some M so that $a_n \le M$ for all $n \in \mathbb{N}$ then $a_n \le M$ for all $n > k$, so if
$\{a_n\}$ is bounded above then $\{a_{n+k}\}$ is bounded above.

If there is some M so that $a_n \le M$ for all $n > k$ then $a_n \le \max\{a_1, a_2, \dots, a_k, M\}$ for
all $n \in \mathbb{N}$, so $\{a_n\}$ is bounded above.

If there is some M so that $a_n \ge M$ for all $n \in \mathbb{N}$ then $a_n \ge M$ for all $n > k$, so if $\{a_n\}$
is bounded below then $\{a_{n+k}\}$ is bounded below.

If there is some M so that $a_n \ge M$ for all $n > k$ then $a_n \ge \min\{a_1, a_2, \dots, a_k, M\}$ for
all $n \in \mathbb{N}$, so $\{a_n\}$ is bounded below.


Chapter 4 


Limits and Continuity 


Definition 26 


Let $f : D \to \mathbb{R}$, where $D \subset \mathbb{R}$ and c is a limit point of D. Then we say that
$\lim_{x \to c} f(x) = L$ if for every $\epsilon > 0$ there is a $\delta > 0$ such that if $0 < |x - c| < \delta$ and $x \in D$
then $|f(x) - L| < \epsilon$.

We say that f is continuous at the point $z \in D$ if for every $\epsilon > 0$ there is a $\delta > 0$
so that if $|x - z| < \delta$ and $x \in D$ then $|f(x) - f(z)| < \epsilon$. We say that a function
f is continuous if it is continuous at every point in its domain. We say that f is
continuous on the set E if f is continuous at every point of E.


The preceding definition gives us another reason for using the term "limit" point to
describe a limit point. A point p is a limit point of set D if and only if there are functions
with domain D that have a limit at the point p. People often think of continuity using the
graph of a function as saying that a function is continuous if its graph has no breaks in it.
That is an accurate definition (once the idea of having no breaks is formalized) if the domain
of the function is an interval, but it is not correct in general. Using the definition above,
we notice that there are some functions which are continuous that do not have connected
graphs. For instance, if the domain of a function f is the integers then the function is
always continuous, because for any integer k in the domain and $\epsilon > 0$, if $|x - k| < 1$ and
$x \in \mathbb{Z}$ then $x = k$ so $|f(x) - f(k)| = 0 < \epsilon$. The graph of a function whose domain is the
integers is just a discrete collection of points, but it is still a continuous function. Likewise,
the function $f(x) = \frac{1}{x}$ is continuous even though its graph is broken into two pieces. The
value zero is not in the domain of that function, and at every value in the domain of the
function, f is continuous, making f a continuous function. A function can be continuous
at a point of the domain where the limit does not exist. Specifically, functions are always
continuous at points of their domains which are not limit points of the domain (and limits
of those functions cannot exist as x approaches those points).
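
For instance, the continuity of $f(x) = \frac{1}{x}$ at a point $c \ne 0$ can be checked directly from the definition. The choice of $\delta$ below is one common choice, offered as a sketch rather than as the text's own argument:

```latex
\text{Given } \epsilon > 0, \text{ let } \delta = \min\left\{ \frac{|c|}{2},\ \frac{\epsilon c^2}{2} \right\}.
\text{ If } |x - c| < \delta \text{ and } x \ne 0, \text{ then } |x| > \frac{|c|}{2}, \text{ so}
\left| \frac{1}{x} - \frac{1}{c} \right| = \frac{|x - c|}{|x|\,|c|} < \frac{2\delta}{c^2} \le \epsilon .
```

The first entry of the minimum keeps x away from 0, and the second makes the final bound at most $\epsilon$.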

Graphically, f is continuous at c as long as, given any $\epsilon$-radius interval centered at $f(c)$, there is a $\delta$-radius
interval centered at c so that the portion of the graph of $y = f(x)$ whose x-coordinates lie
within $(c - \delta, c + \delta)$ has y-coordinates within $(f(c) - \epsilon, f(c) + \epsilon)$, meaning that the section
of the graph of f inside the vertical band between lines $x = c - \delta$ and $x = c + \delta$ also lies
within the horizontal band between $y = f(c) - \epsilon$ and $y = f(c) + \epsilon$.


[Figure: Continuity at c]

[Figure: Continuity at Isolated Point c]

The picture for $\lim_{x \to c} f(x) = L$ is similar, except that c must be a limit point of the
domain, not an isolated point, and it need not be in the domain. Furthermore, the value 
of f(c) has no bearing on the limit if c is in the domain (only the points close to but not 
equal to c are relevant to the definition of function limit). 


[Figure: Limit at c]


We note, at this point, that it is instructive to look at trigonometric functions from 
time to time, but we are not developing trigonometry rigorously, and are assuming that 
properties of geometry and trigonometry were proven in another text and are now being 
assumed to be true. 


Theorem 4.1. The Sequential Characterization of Limits. Let $f : D \to \mathbb{R}$, where $D \subset \mathbb{R}$
and c is a limit point of D. Then $\lim_{x \to c} f(x) = L$ if and only if for every sequence
$\{x_n\} \subset D \setminus \{c\}$, if $\{x_n\} \to c$ then $\{f(x_n)\} \to L$.

Proof. First, assume that $\lim_{x \to c} f(x) = L$. Let $\{x_n\} \subset D \setminus \{c\}$ such that $\{x_n\} \to c$. Let $\epsilon > 0$.
Then for some $\delta > 0$, we know that if $0 < |x - c| < \delta$ and $x \in D$ then $|f(x) - L| < \epsilon$. Since
$\{x_n\} \to c$, we can find $N \in \mathbb{N}$ so that if $n \ge N$ then $|x_n - c| < \delta$, and since $\{x_n\} \subset D \setminus \{c\}$
it follows that if $n \ge N$ then $0 < |x_n - c| < \delta$. Hence, if $n \ge N$ then $|f(x_n) - L| < \epsilon$, so
$\{f(x_n)\} \to L$.

Next, assume that for every sequence $\{x_n\} \subset D \setminus \{c\}$, if $\{x_n\} \to c$ then $\{f(x_n)\} \to L$.
Suppose that $\lim_{x \to c} f(x) \ne L$. Then we can find an $\epsilon > 0$ so that for every $\delta > 0$ there is some
$x \in D \setminus \{c\}$ so that $|x - c| < \delta$ but $|f(x) - L| \ge \epsilon$. For each $n \in \mathbb{N}$ we choose $x_n \in D \setminus \{c\}$ so
that $|x_n - c| < \frac{1}{n}$ and $|f(x_n) - L| \ge \epsilon$. Since $c - \frac{1}{n} < x_n < c + \frac{1}{n}$ we know by the Squeeze
Theorem that $\{x_n\} \to c$. But $\{f(x_n)\}$ does not converge to L, contradicting our assumption.




Theorem 4.2. The Sequential Characterization of Continuity. Let $f : D \to \mathbb{R}$, where
$D \subset \mathbb{R}$ and $c \in D$. Then f is continuous at c if and only if for every sequence $\{x_n\} \subset D$,
if $\{x_n\} \to c$ then $\{f(x_n)\} \to f(c)$.


The proof is similar to that of the preceding theorem and is left as an exercise. 


Theorem 4.3. The functions $f(x) = k$ and $g(x) = x$ on domain D are continuous.

Proof. Let $c \in D$, and let $\{x_n\} \to c$ for some $\{x_n\} \subset D$. By Theorem 3.2 we know
$\{f(x_n)\} = \{k\} \to k = f(c)$. Also, we know that $\{g(x_n)\} = \{x_n\} \to c = g(c)$. Thus, f and
g are continuous by Theorem 4.2.


Theorem 4.4. Let $f : D \to \mathbb{R}$, where $D \subset \mathbb{R}$ and $c \in D$.
(a) Let c be a limit point of D. Then f is continuous at c if and only if $\lim_{x \to c} f(x) = f(c)$.
(b) If c is not a limit point of D then f is continuous at c.

Proof. (a) First, assume that f is continuous at c. Let $\epsilon > 0$. We know that for some $\delta > 0$,
if $|x - c| < \delta$ and $x \in D$ then $|f(x) - f(c)| < \epsilon$. Hence, if $0 < |x - c| < \delta$ and $x \in D$ then
$|f(x) - f(c)| < \epsilon$, so $\lim_{x \to c} f(x) = f(c)$.

Next, assume that $\lim_{x \to c} f(x) = f(c)$. Let $\epsilon > 0$. We know that for some $\delta > 0$, if
$0 < |x - c| < \delta$ and $x \in D$ then $|f(x) - f(c)| < \epsilon$, but if $x = c$ then $|f(x) - f(c)| = 0 < \epsilon$ as
well. Hence, if $|x - c| < \delta$ and $x \in D$ then $|f(x) - f(c)| < \epsilon$, so f is continuous at c.

(b) Since c is an isolated point of D, we can find $\delta > 0$ so that the only point of D
whose distance from c is less than $\delta$ is c. Hence, if $|x - c| < \delta$ and $x \in D$ then $x = c$, so
$|f(x) - f(c)| = 0$, which is less than any positive number $\epsilon$, and therefore f is continuous at
c.


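Theorem 4.5. (a) Let $f : \mathrm{dom}(f) \to \mathbb{R}$ be continuous at c and let $f(c) \ne 0$. Then there is
some $\delta > 0$ so that $|f(x)| > \frac{|f(c)|}{2}$ if $x \in (c - \delta, c + \delta) \cap \mathrm{dom}(f)$.

(b) If f is continuous at $c \in (a,b) \subset \mathrm{dom}(f)$ and $f(c) \ne 0$ then there is some $\delta > 0$ so
that $(c - \delta, c + \delta) \subset \mathrm{dom}(\frac{1}{f})$.

(c) Let $g : \mathrm{dom}(g) \to \mathbb{R}$. If $\lim_{x \to c} g(x) = L \ne 0$ then there is some $\delta > 0$ so that
$|g(x)| > \frac{|L|}{2}$ if $x \in (c - \delta, c + \delta) \cap \mathrm{dom}(g) \setminus \{c\}$, and c is a limit point of $\mathrm{dom}(\frac{1}{g})$.

Proof. (a) Choose $\delta > 0$ so that if $|x - c| < \delta$ and $x \in \mathrm{dom}(f)$ then $|f(x) - f(c)| < \frac{|f(c)|}{2}$.
It follows that if $x \in (c - \delta, c + \delta) \cap \mathrm{dom}(f)$ then $|f(x)| > \frac{|f(c)|}{2}$ by the Triangle Inequality,
so $\frac{1}{f(x)}$ is defined.

(b) With $\delta$ as in (a), if $c \in (a,b) \subset \mathrm{dom}(f)$ then set $\delta_1 = \min\{\delta, |c - a|, |c - b|\}$.
If $x \in (c - \delta_1, c + \delta_1)$ then $x \in \mathrm{dom}(f)$ and $f(x) \ne 0$ and therefore $x \in \mathrm{dom}(\frac{1}{f})$, so
$(c - \delta_1, c + \delta_1) \subset \mathrm{dom}(\frac{1}{f})$.

(c) Let $\lim_{x \to c} g(x) = L \ne 0$. Choose $\delta > 0$ so that if $0 < |x - c| < \delta$ and $x \in \mathrm{dom}(g)$ then
$|g(x) - L| < \frac{|L|}{2}$, so $|g(x)| > \frac{|L|}{2}$. It follows that if $x \in (c - \delta, c + \delta) \cap \mathrm{dom}(g) \setminus \{c\}$ then $\frac{1}{g(x)}$
is defined. Let $\epsilon > 0$. Let $\gamma = \min\{\delta, \epsilon\}$. Since c is a limit point of $\mathrm{dom}(g)$ we can find a
point $q \ne c$ so that $q \in (c - \gamma, c + \gamma) \cap \mathrm{dom}(g)$, which means that $q \in \mathrm{dom}(\frac{1}{g}) \cap (c - \epsilon, c + \epsilon)$
and therefore c is a limit point of $\mathrm{dom}(\frac{1}{g})$.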


Example 4.1. (a) Let $f(x) = x$ if $x \ne 2$ and let $f(2) = 5$. Find $\lim_{x \to 2} f(x)$.
(b) Let $f(x) = 0$ if $x \in \mathbb{Q}$ and let $f(x) = 1$ if $x \in \mathbb{R} \setminus \mathbb{Q}$. Prove that f is discontinuous
at every real number.
(c) Let $f : \mathbb{N} \to \mathbb{R}$ be defined by $f(n) = x_n$ for some sequence $\{x_n\}$. At what points is f
continuous?

Solution. (a) $\lim_{x \to 2} f(x) = 2$ since $g(x) = x$ is continuous by Theorem 4.3 (the value of the
function at the point approached does not affect the limit).

(b) Let $p \in \mathbb{R}$ and let $\delta > 0$. Then $(p - \delta, p + \delta)$ contains both an irrational number a and
a rational number q. Thus, $|f(a) - f(q)| = 1$, so if $|f(a) - f(p)| < \frac{1}{2}$ then $|f(p) - f(q)| \ge \frac{1}{2}$,
and if $|f(q) - f(p)| < \frac{1}{2}$ then $|f(p) - f(a)| \ge \frac{1}{2}$. Hence, there is no $\delta > 0$ so that for every
x so that $|x - p| < \delta$ it is true that $|f(x) - f(p)| < \frac{1}{2}$. Hence, f is not continuous at any
point p.

(c) f is continuous at every point in its domain. To see this, let $\epsilon > 0$. For any $m \in \mathbb{N}$,
if $|x - m| < 1$ and $x \in \mathrm{dom}(f)$ then $x = m$, which means $|f(x) - f(m)| = 0 < \epsilon$.
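
Part (a) can also be viewed through the Sequential Characterization of Limits. The sketch below is a numerical illustration only; the particular sequence $x_n = 2 + \frac{1}{n}$ and the variable names are assumptions chosen for the example.

```python
def f(x):
    """The function of Example 4.1(a): f(x) = x for x != 2 and f(2) = 5."""
    return 5.0 if x == 2 else x

# x_n = 2 + 1/n lies in the domain with x_n != 2, and x_n converges to 2.
xs = [2 + 1 / n for n in (1, 10, 100, 1000, 10000)]

# Distance from f(x_n) to the claimed limit 2; the value f(2) = 5 never enters.
gaps = [abs(f(x) - 2) for x in xs]
print(gaps)
```

The gaps shrink along the sequence even though $f(2) = 5$, which reflects the fact that the value at the point approached has no bearing on the limit.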


Theorem 4.6. Let $f, g : D \to \mathbb{R}$, where $D \subset \mathbb{R}$ and c is a limit point of D. Let $\lim_{x \to c} f(x) = r$
and $\lim_{x \to c} g(x) = s$. Then the following are true:
(a) Sum Rule (for limits): $\lim_{x \to c} \big(a f(x) + b g(x)\big) = ar + bs$
(b) Product Rule (for limits): $\lim_{x \to c} f(x) g(x) = rs$
(c) Quotient Rule (for limits): If $s \ne 0$ then $\lim_{x \to c} \frac{f(x)}{g(x)} = \frac{r}{s}$

Proof. Let $\{x_n\} \subset D \setminus \{c\}$ so that $\{x_n\} \to c$. Then by Theorem 4.1 we know that $\{f(x_n)\} \to r$
and $\{g(x_n)\} \to s$, so by Theorem 3.9, $\{a f(x_n) + b g(x_n)\} \to ar + bs$, $\{f(x_n) g(x_n)\} \to rs$,
and if $\{x_n\} \subset \mathrm{dom}(\frac{f}{g}) \setminus \{c\}$ then $\{\frac{f(x_n)}{g(x_n)}\} \to \frac{r}{s}$ (and if $s \ne 0$ then by Theorem 4.5, we
know that c is a limit point of $\mathrm{dom}(\frac{f}{g})$). Hence, by Theorem 4.1, the result follows.


Note that each of these can be proven directly without using the Sequential Characterization 
of Limits. It is to our advantage to develop sequences for other proofs as well, and having 
developed sequences it is arguably a waste of time to duplicate everything for function 
limits. 

We may not always refer to Theorems 4.1 and 4.2 when we use them. Readers are
encouraged to think of the sequential methods of characterizing limits and continuity as
being more like a second definition than a theorem. The sum rule is usually written without
the constants a and b multiplied by the functions, but it is convenient for us to include those
constants.


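Theorem 4.7. Let f, g be continuous at c. Then $f + g$, $fg$ are continuous at c and $\frac{f}{g}$ is
continuous at c if $g(c) \ne 0$.

Proof. Let $D = \mathrm{dom}(f) \cap \mathrm{dom}(g)$ and let $\{x_n\} \subset D$ so that $\{x_n\} \to c$. Then by Theorem 4.2
we know that $\{f(x_n)\} \to f(c)$
and $\{g(x_n)\} \to g(c)$. By Theorem 3.9, it follows that $\{f(x_n) + g(x_n)\} \to f(c) + g(c)$,
$\{f(x_n) g(x_n)\} \to f(c) g(c)$, and if $\{x_n\} \subset \mathrm{dom}(\frac{f}{g})$ and $g(c) \ne 0$ then $\{\frac{f(x_n)}{g(x_n)}\} \to \frac{f(c)}{g(c)}$.
Thus, the result follows from Theorem 4.2.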


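Theorem 4.8. Squeeze Theorem (for limits). Let $f, g, h : D \to \mathbb{R}$, where $D \subset \mathbb{R}$ and c is
a limit point of D. If there is a $\delta_1 > 0$ so that $f(x) \le g(x) \le h(x)$ or $h(x) \le g(x) \le f(x)$
for all $x \in D$ so that $0 < |x - c| < \delta_1$ and $\lim_{x \to c} f(x) = L = \lim_{x \to c} h(x)$ then $\lim_{x \to c} g(x) = L$.

Proof. Let $\epsilon > 0$. Choose $\delta_2 > 0$ so that if $x \in D$ and $0 < |x - c| < \delta_2$ then $|f(x) - L| < \epsilon$. Choose $\delta_3 > 0$ so
that if $x \in D$ and $0 < |x - c| < \delta_3$ then $|h(x) - L| < \epsilon$. Let $\delta = \min(\delta_1, \delta_2, \delta_3)$. If $x \in D$ and $0 < |x - c| < \delta$
then $L - \epsilon < f(x) \le g(x) \le h(x) < L + \epsilon$ or $L - \epsilon < h(x) \le g(x) \le f(x) < L + \epsilon$, so
$|g(x) - L| < \epsilon$, so $\lim_{x \to c} g(x) = L$.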

As with sequences, the Squeeze Theorem for limits helps us evaluate many function 
limits. 


Example 4.2. Prove that $\lim_{x \to 0} x \sin(\frac{1}{x}) = 0$.

Solution. We assume it is known that $-1 \le \sin(\frac{1}{x}) \le 1$ (we are assuming properties of
trigonometric functions which are not the result of limits are known in this text). In that
case $x \sin(\frac{1}{x})$ is in the interval $[-|x|, |x|]$ for all $x \ne 0$. If $x > 0$ then $-x \le x \sin(\frac{1}{x}) \le x$,
and if $x < 0$ then $x \le x \sin(\frac{1}{x}) \le -x$. Since $\lim_{x \to 0} x = 0 = \lim_{x \to 0} -x$ we conclude that
$\lim_{x \to 0} x \sin(\frac{1}{x}) = 0$ by the Squeeze Theorem.
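
The bounds used in the solution can be spot-checked numerically. This is only a sanity check at a few sample points, not a proof; the function name `g` and the sample points are assumptions made for the sketch.

```python
import math

def g(x):
    """g(x) = x * sin(1/x), defined for x != 0."""
    return x * math.sin(1 / x)

# Sample points approaching 0 from both sides: +-10^-1, ..., +-10^-7.
samples = [sign * 10.0 ** (-k) for k in range(1, 8) for sign in (1, -1)]

# The squeeze bounds -|x| <= g(x) <= |x| at every sample point.
bounded = all(-abs(x) <= g(x) <= abs(x) for x in samples)
print(bounded)
```

Since both bounding functions tend to 0 as x tends to 0, the values of g are forced toward 0 as well, which is what the Squeeze Theorem asserts.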


Theorem 4.9. Comparison Theorem (for function limits). Let $f, g : D \to \mathbb{R}$, where $D \subset \mathbb{R}$
and c is a limit point of D. If there is a $\delta > 0$ so that $f(x) \le g(x)$ for all $x \in D$ so that
$0 < |x - c| < \delta$ and $\lim_{x \to c} f(x) = s$ and $\lim_{x \to c} g(x) = t$ then $s \le t$.

Proof. Let $\{x_n\} \subset D \setminus \{c\}$ so that $\{x_n\} \to c$. Then by Theorem 4.1 we know that $\{f(x_n)\} \to s$
and $\{g(x_n)\} \to t$. Choose $k \in \mathbb{N}$ so that if $n \ge k$ then $0 < |x_n - c| < \delta$. If $n \ge k$ then
$f(x_n) \le g(x_n)$ so the hypotheses of the Comparison Theorem for sequences are satisfied,
and hence $s \le t$.


Note that when we refer to the Comparison or Squeeze theorems we usually assume it 
is clear from context which form (sequences or limits) is being cited, so we typically do not 
say "Squeeze Theorem for Limits" in arguments and instead just say "Squeeze Theorem." 


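Theorem 4.10. Let $f, g : D \to \mathbb{R}$ and let $c$ be a limit point of $D$. Let $\lim_{x \to c} f(x) = L$.
(a) If $\lim_{x \to c} (f(x) + g(x)) = L + R$ then $\lim_{x \to c} g(x) = R$.
(b) If $L \neq 0$ and $\lim_{x \to c} f(x)g(x) = LR$ then $\lim_{x \to c} g(x) = R$.


Proof. Let $\{x_n\} \subset D \setminus \{c\}$ so that $\{x_n\} \to c$. Then by Theorem 4.1 we know that $\{f(x_n)\} \to L$, and that $\{f(x_n) + g(x_n)\} \to L + R$ in (a) and $\{f(x_n)g(x_n)\} \to LR$ in (b). Thus, by Exercises 3.8 and 3.9 we can conclude that $\{g(x_n)\} \to R$ in (a) and (b) respectively, which means that $\lim_{x \to c} g(x) = R$.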


Theorem 4.11. If $c$ is a limit point of $\mathrm{dom}(f \circ g)$ and $\lim_{x \to c} g(x) = L$ and $f$ is continuous at $L$ then $\lim_{x \to c} f(g(x)) = f(L)$. If $L$ is a limit point of the domain of $f$ then $\lim_{x \to c} f(g(x)) = \lim_{y \to L} f(y)$.


Proof. Let $\{x_n\} \subset \mathrm{dom}(f \circ g) \setminus \{c\}$ so that $\{x_n\} \to c$. Then by The Sequential Characterization of Limits we know that $\{g(x_n)\} \to L$, and since $f$ is continuous at $L$ we know that $\{f(g(x_n))\} \to f(L)$ by The Sequential Characterization of Continuity. Thus, by The Sequential Characterization of Limits we know that $\lim_{x \to c} f(g(x)) = f(L)$. If $L$ is a limit point of the domain of $f$ then by Theorem 4.4, $\lim_{y \to L} f(y) = f(L) = \lim_{x \to c} f(g(x))$.


Theorem 4.12. Let $f$ be continuous at $g(c)$ and $g$ be continuous at $c$. Then $f \circ g$ is continuous at $c$.


Proof. Let $\{x_n\} \subset \mathrm{dom}(f \circ g)$ so that $\{x_n\} \to c$. Then by The Sequential Characterization of Continuity we know that $\{g(x_n)\} \to g(c)$ and hence $\{f(g(x_n))\} \to f(g(c))$, so $f \circ g$ is continuous at $c$.


There are advantages and disadvantages to proving theorems about infinite limits in a 
brief development of advanced calculus. For the most part, we will not need theorems on 
infinite limits, but we will develop them in more detail in the section on infinite limits in 
the Supplementary Materials section for those who are interested. A problem with these 
sorts of definitions is that the more general forms of theorems involving limits that could 
be infinite at points or possibly at infinity or negative infinity tend to take a fairly simple 
idea for a proof and require repetitions for many cases to make it rigorous. However, it 
is worthwhile to understand what these definitions are whether we spend a lot of time 
considering theorems about them or not. 


Definition 27 


Let $f : D \to \mathbb{R}$, where $D$ is not bounded above. We say that $\lim_{x \to \infty} f(x) = L$ if, for every $\epsilon > 0$, there is an $M$ so that if $x > M$ and $x \in D$ then $|f(x) - L| < \epsilon$. We say that $\lim_{x \to \infty} f(x) = \infty$ if, for every $T$, there is an $M$ so that if $x > M$ and $x \in D$ then $f(x) > T$. We say that $\lim_{x \to \infty} f(x) = -\infty$ if, for every $T$, there is an $M$ so that if $x > M$ and $x \in D$ then $f(x) < T$.

Let $f : D \to \mathbb{R}$, where $D$ is not bounded below. We say that $\lim_{x \to -\infty} f(x) = L$ if, for every $\epsilon > 0$, there is an $M$ so that if $x < M$ and $x \in D$ then $|f(x) - L| < \epsilon$. We say that $\lim_{x \to -\infty} f(x) = \infty$ if, for every $T$, there is an $M$ so that if $x < M$ and $x \in D$ then $f(x) > T$. We say that $\lim_{x \to -\infty} f(x) = -\infty$ if, for every $T$, there is an $M$ so that if $x < M$ and $x \in D$ then $f(x) < T$.

Let $f : D \to \mathbb{R}$, where $c$ is a limit point of $D$. We say that $\lim_{x \to c} f(x) = \infty$ if, for every $M \in \mathbb{R}$, there is a $\delta > 0$ so that if $0 < |x - c| < \delta$ and $x \in D$ then $f(x) > M$. Similarly, we define $\lim_{x \to c} f(x) = -\infty$ if, for every $M \in \mathbb{R}$, there is a $\delta > 0$ so that if $0 < |x - c| < \delta$ and $x \in D$ then $f(x) < M$.


Note that if $D = \mathbb{N}$, where $f(n) = x_n$ for all $n \in \mathbb{N}$, then $\lim_{n \to \infty} f(n) = \lim_{n \to \infty} x_n = L$ is equivalent to $\{x_n\} \to L$. The details of this are left as an exercise.


Theorem 4.13. Sequential Characterization of Limits for Infinite Limits at real numbers. Let $f : D \to \mathbb{R}$, where $D \subset \mathbb{R}$ and $c$ is a limit point of $D$. Then $\lim_{x \to c} f(x) = \infty$ (or $-\infty$ respectively) if and only if for every sequence $\{x_n\} \subset D \setminus \{c\}$, if $\{x_n\} \to c$ then $\{f(x_n)\} \to \infty$ (or $-\infty$ respectively).

Proof. First, assume that $\lim_{x \to c} f(x) = \infty$, and let $\{x_n\} \subset D \setminus \{c\}$ so that $\{x_n\} \to c$. Let $M \in \mathbb{R}$. We can choose $\delta > 0$ so that if $x \in D$ and $0 < |x - c| < \delta$ then $f(x) > M$. Choose $k \in \mathbb{N}$ so that if $n \ge k$ then $|x_n - c| < \delta$. Then since $\{x_n\} \subset D \setminus \{c\}$ it follows that if $n \ge k$ then $0 < |x_n - c| < \delta$ and $f(x_n) > M$. Thus, $\{f(x_n)\} \to \infty$.

Likewise, if $\lim_{x \to c} f(x) = -\infty$ and $M \in \mathbb{R}$ then we can choose $\delta > 0$ so that if $x \in D$ and $0 < |x - c| < \delta$ then $f(x) < M$. Choose $k \in \mathbb{N}$ so that if $n \ge k$ then $|x_n - c| < \delta$. Then since $\{x_n\} \subset D \setminus \{c\}$ it follows that if $n \ge k$ then $0 < |x_n - c| < \delta$ and $f(x_n) < M$. Thus, $\{f(x_n)\} \to -\infty$.

Next, assume that for every sequence $\{x_n\} \subset D \setminus \{c\}$, if $\{x_n\} \to c$ then $\{f(x_n)\} \to \infty$ (or $-\infty$ respectively). Suppose that $\lim_{x \to c} f(x) \neq \infty$ (or $-\infty$ respectively). Then for some $M \in \mathbb{R}$, for every $\delta > 0$ we can choose $x \in D$ so that $0 < |x - c| < \delta$ but $f(x) \le M$ (or $f(x) \ge M$ respectively). Thus, for every integer $n \in \mathbb{N}$ we can pick $x_n \in D \setminus \{c\}$ so that $|x_n - c| < \frac{1}{n}$ and $f(x_n) \le M$ (or $f(x_n) \ge M$ respectively). Hence, $\{x_n\} \to c$ but $\{f(x_n)\} \not\to \infty$ (or $-\infty$ respectively), a contradiction.


Theorem 4.14. Let $f : D \to \mathbb{R}$ be a function so $(a, \infty) \subset D$ for some $a \in \mathbb{R}$. Then $\lim_{x \to \infty} f(x) = L$ if and only if $\lim_{x \to 0^+} f(\frac{1}{x}) = L$. Likewise, if $g : D \to \mathbb{R}$ is a function so $(-\infty, a) \subset D$ then $\lim_{x \to -\infty} g(x) = L$ if and only if $\lim_{x \to 0^-} g(\frac{1}{x}) = L$.


Proof. Let $\epsilon > 0$. Assume $\lim_{x \to \infty} f(x) = L$. Choose $M > \max(a, 0)$ so that if $x > M$ then $|f(x) - L| < \epsilon$. Then if $0 < x < \frac{1}{M}$ it follows that $\frac{1}{x} > M$, so $|f(\frac{1}{x}) - L| < \epsilon$. Hence, $\lim_{x \to 0^+} f(\frac{1}{x}) = L$. Similarly, if $\lim_{x \to 0^+} f(\frac{1}{x}) = L$ then we can choose $\delta > 0$ so that if $0 < x < \delta$ then $|f(\frac{1}{x}) - L| < \epsilon$. Thus, if $M > \max(a, \frac{1}{\delta})$ then if $y > M$ we know that $y \in (a, \infty)$ and $0 < \frac{1}{y} < \delta$, so $f(y) = f(\frac{1}{x})$ for $x = \frac{1}{y}$ so that $0 < x < \delta$, which means that $|f(y) - L| < \epsilon$. Hence, $\lim_{x \to \infty} f(x) = L$.

The proof that if $g : (-\infty, a) \to \mathbb{R}$ then $\lim_{x \to -\infty} g(x) = L$ if and only if $\lim_{x \to 0^-} g(\frac{1}{x}) = L$ is similar. Assume $\lim_{x \to -\infty} g(x) = L$. Choose $M < \min(a, 0)$ so that if $x < M$ then $|g(x) - L| < \epsilon$. Then if $\frac{1}{M} < x < 0$ it follows that $\frac{1}{x} < M$, so $|g(\frac{1}{x}) - L| < \epsilon$. Hence, $\lim_{x \to 0^-} g(\frac{1}{x}) = L$. Similarly, if $\lim_{x \to 0^-} g(\frac{1}{x}) = L$ then we can choose $\delta > 0$ so that if $-\delta < x < 0$ then $|g(\frac{1}{x}) - L| < \epsilon$. Thus, if $M < \min(a, -\frac{1}{\delta})$ then if $y < M$ we know that $y \in (-\infty, a)$ and $-\delta < \frac{1}{y} < 0$, so $g(y) = g(\frac{1}{x})$ for $x = \frac{1}{y}$ so that $-\delta < x < 0$, which means that $|g(y) - L| < \epsilon$. Hence, $\lim_{x \to -\infty} g(x) = L$.


There are other results about infinite limits that we do not need for this development 
which are addressed in the Supplementary Materials section. 


Sometimes it is helpful to specifically look at the portion of the domain of a function 
which is greater than or less than a point, and the corresponding one sided limit at that 
point obtained from such a restriction. 




Definition 28 


Let $f : D \to \mathbb{R}$, and let $A \subset D$. We define the restriction of $f$ to $A$ to be the function $g : A \to \mathbb{R}$ defined by setting $g(x) = f(x)$ for all $x \in A$. We use the notation $f|_A$ to denote this restriction. We define $\lim_{x \to c^+} f(x) = \lim_{x \to c} f|_{D \cap (c, \infty)}(x)$ and $\lim_{x \to c^-} f(x) = \lim_{x \to c} f|_{D \cap (-\infty, c)}(x)$ if these exist (this definition holds for both finite and infinite limits).


Another way of stating those two definitions (for finite limits) is that if $c$ is a limit point of the points of $D$ preceding $c$ then we define $\lim_{x \to c^-} f(x) = L$ if for every $\epsilon > 0$ there is a $\delta > 0$ so that if $c - \delta < x < c$ and $x \in D$ then $|f(x) - L| < \epsilon$, and if $c$ is a limit point of the points of $D$ greater than $c$ then $\lim_{x \to c^+} f(x) = L$ if for every $\epsilon > 0$ there is a $\delta > 0$ so that if $c < x < c + \delta$ and $x \in D$ then $|f(x) - L| < \epsilon$. These limits are called one sided limits, or limits approaching from below or above (or from the left or right).


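Theorem 4.15. Let $f : D \to \mathbb{R}$, where $D \subset \mathbb{R}$ and $c$ is a limit point of $D$, $D \cap (c, \infty)$ and $D \cap (-\infty, c)$. Then $\lim_{x \to c} f(x) = L$ if and only if $\lim_{x \to c^-} f(x) = L$ and $\lim_{x \to c^+} f(x) = L$.


Proof. Let $\epsilon > 0$. First, assume that $\lim_{x \to c} f(x) = L$, and pick $\delta > 0$ so that if $x \in D$ and $0 < |x - c| < \delta$ then $|f(x) - L| < \epsilon$. Then if $x \in D$ and $c - \delta < x < c$ or $c < x < c + \delta$ it follows that $|f(x) - L| < \epsilon$. Hence, $\lim_{x \to c^-} f(x) = L$ and $\lim_{x \to c^+} f(x) = L$.

Next, assume that $\lim_{x \to c^-} f(x) = L$ and $\lim_{x \to c^+} f(x) = L$. Choose $\delta_1, \delta_2 > 0$ so that for $x \in D$, if $c - \delta_1 < x < c$ then $|f(x) - L| < \epsilon$ and if $c < x < c + \delta_2$ then $|f(x) - L| < \epsilon$. Let $\delta = \min(\delta_1, \delta_2)$. Then if $x \in D$ and $0 < |x - c| < \delta$, it must follow that either $c - \delta_1 < x < c$ or $c < x < c + \delta_2$, so $|f(x) - L| < \epsilon$. Hence, $\lim_{x \to c} f(x) = L$.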


By adjusting the domain in most of these limit theorems, we can see that most of the 
theorems we have proven are also proven for one sided limits (just restrict our hypothesis 
to domains containing points on only one side of c). 


If readers are sufficiently comfortable with more abstract topological ideas then it may be 
helpful to introduce the definition of compactness now. However, the results about compact 
and connected sets can also be ignored until later without losing much. We include these 
theorems in the Supplementary Materials section so that readers who are interested can 
look at the topological theorems first. It is almost certainly a good idea to learn these 
theorems eventually, but it is possible to wait until discussing more abstract ideas or even 
until generalizing theorems to R” if that is preferable. If the reader chooses to study 
the topological theorems now then some of the proofs of theorems in the remainder of 
this section can be abbreviated once these topological foundations are established. These 
alternate proofs are in the Supplementary Materials chapter in the "Topology of the Real 
Line" section. 


Theorem 4.16. The Extreme Value Theorem. Let $f : K \to \mathbb{R}$ be continuous, where $K$ is closed and bounded. Then there are points $s, t \in K$ so $f(s) \le f(x) \le f(t)$ for every $x \in K$.


Proof. Part 1: We wish to show that $f$ is bounded. Suppose $f$ is not bounded. Then for every $n \in \mathbb{N}$ we may choose $x_n \in K$ so that $|f(x_n)| > n$. Since $\{x_n\}$ is bounded, by the Bolzano-Weierstrass Theorem we can find a convergent subsequence $\{x_{n_k}\} \to p$, where $p \in K$ because $K$ is closed. Since $f$ is continuous it follows that $\{f(x_{n_k})\} \to f(p)$. But for every $M > 0$ we can find $N \in \mathbb{N}$ so that $N > M$ and thus $|f(x_{n_N})| > n_N \ge N > M$, and hence $\{f(x_{n_k})\}$ is not bounded and therefore cannot converge. This contradiction implies that $f$ is bounded.

Part 2: We wish to show that there is a point $t \in K$ so that $f(t) = \sup(f(K))$. Let $l = \sup(f(K))$. By the Approximation Property, for every $n \in \mathbb{N}$ we can choose a point $z_n \in K$ so that $l - \frac{1}{n} < f(z_n) \le l$. Since $\{z_n\}$ is bounded, by the Bolzano-Weierstrass Theorem we can find a convergent subsequence $\{z_{n_k}\} \to t \in K$ (where $t \in K$ because $K$ is closed). Since $f$ is continuous, we know that $\{f(z_{n_k})\} \to f(t)$, and by the Squeeze Theorem $\{f(z_{n_k})\} \to l$. Hence, $f(t) = l$.

Finally, by Part 2 we may find $s \in K$ so that $-f(s)$ is the maximum of $-f(K)$ and thus $f(s) \le f(x)$ for all $x \in K$.


Theorem 4.17. Intermediate Value Theorem. Let $f : [a, b] \to \mathbb{R}$ be continuous and let $r$ be between $f(a)$ and $f(b)$. Then $f(c) = r$ for some $c \in (a, b)$.


Proof. First, assume $f(a) < r < f(b)$. Let $S = \{x \in [a, b] \mid f(x) \le r\}$. Note that $a \in S$ and $b$ is an upper bound for $S$, so $S$ has a least upper bound $c$. Thus, for each $n \in \mathbb{N}$ we can choose $x_n \in (c - \frac{1}{n}, c]$ so that $x_n \in S$. By the Squeeze Theorem, $\{x_n\} \to c$, so by the Comparison Theorem $\{f(x_n)\} \to f(c) \le r$. Hence, $c < b$. For $x \in (c, b)$ we know that $f(x) > r$, so by the Comparison Theorem we know that $\lim_{x \to c^+} f(x) = f(c) \ge r$. Thus, $f(c) = r$.

Note that if $f(b) < r < f(a)$ then $-f(a) < -r < -f(b)$, so for some $c \in (a, b)$ we know that $-f(c) = -r$ and therefore $f(c) = r$.


The following is another proof. It is a little longer, but it is easier to see pictorially. 


Proof. Assume that $f(a) < r < f(b)$. Let $a_1 = a$ and $b_1 = b$. If $f(\frac{a_1 + b_1}{2}) > r$ then set $a_2 = a_1$ and $b_2 = \frac{a_1 + b_1}{2}$. Otherwise, set $a_2 = \frac{a_1 + b_1}{2}$ and $b_2 = b_1$. Proceeding inductively, if we have chosen $a_i, b_i$ for $1 \le i \le k$ so that $a_1 \le a_2 \le \dots \le a_k < b_k \le b_{k-1} \le \dots \le b_1$, $f(a_i) \le r < f(b_i)$ and $b_i - a_i = \frac{b_1 - a_1}{2^{i-1}}$, then we choose $a_{k+1}$ and $b_{k+1}$ as follows. If $f(\frac{a_k + b_k}{2}) > r$ then set $a_{k+1} = a_k$ and $b_{k+1} = \frac{a_k + b_k}{2}$. Otherwise, set $a_{k+1} = \frac{a_k + b_k}{2}$ and $b_{k+1} = b_k$, and note that all the aforementioned properties hold when $i = k + 1$. Since $\{a_n\}$ is bounded above by $b$ and increasing, we know that $\{a_n\} \to c = \sup(\{a_n\}) \in [a, b]$ by the Monotone Convergence Theorem. Since $\{b_n - a_n\} \to 0$ we know that $\{b_n\} \to c$. Since $f$ is continuous, $\{f(a_n)\} \to f(c)$ and $\{f(b_n)\} \to f(c)$. By the Comparison Theorem, we know $f(c) \le r$ since $f(a_n) \le r$ for all $n$, and $f(c) \ge r$ since $f(b_n) > r$ for all $n$, and therefore $f(c) = r$.

If $f(b) < r < f(a)$ then $-f(a) < -r < -f(b)$, so for some $c \in (a, b)$ we know that $-f(c) = -r$ and therefore $f(c) = r$.
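The interval-halving construction in this second proof is also a practical way to approximate the point c. The following is a minimal sketch under our own assumptions: the name bisect_ivt, the fixed number of halvings, and the sample function are illustrative choices, and floating point arithmetic only approximates the real-number argument.

```python
from typing import Callable

def bisect_ivt(f: Callable[[float], float], a: float, b: float, r: float,
               steps: int = 60) -> float:
    """Approximate a point c in [a, b] with f(c) = r, assuming f is
    continuous on [a, b] and f(a) < r < f(b)."""
    if not f(a) < r < f(b):
        raise ValueError("need f(a) < r < f(b)")
    lo, hi = a, b  # plays the role of a_k, b_k: f(lo) <= r < f(hi)
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(mid) > r:
            hi = mid   # keep the left half, as in the proof
        else:
            lo = mid   # keep the right half
    return lo

# Hypothetical usage: solve x**2 = 2 on [0, 2].
c = bisect_ivt(lambda x: x * x, 0.0, 2.0, 2.0)
print(c)
```

Each pass halves the interval, so after k passes the two endpoints are within (b - a)/2^k of each other, mirroring the identity for b_i - a_i used in the proof.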


Definition 29 


Let $f : D \to \mathbb{R}$. We say that $f$ is uniformly continuous on $A \subset D$ if for every $\epsilon > 0$ there is a $\delta > 0$ so that if $x, y \in A$ and $|x - y| < \delta$ then $|f(x) - f(y)| < \epsilon$. We say that $f$ is uniformly continuous if $f$ is uniformly continuous on $D$.


Theorem 4.18. Let f : K > R be continuous, where K is closed and bounded. Then f is 
uniformly continuous. 


Proof. Suppose $f$ is not uniformly continuous. Then we can pick $\epsilon > 0$ so that for every $\delta > 0$ there are points $x, y \in K$ such that $|x - y| < \delta$ and $|f(x) - f(y)| \ge \epsilon$. For each $n \in \mathbb{N}$ we choose $x_n, y_n \in K$ so that $|x_n - y_n| < \frac{1}{n}$ and $|f(x_n) - f(y_n)| \ge \epsilon$. Since $\{x_n\}$ is bounded, by the Bolzano-Weierstrass Theorem there is a subsequence $\{x_{n_k}\} \to p$, where $p \in K$ because $K$ is closed. Since $\{x_{n_k} - y_{n_k}\} \to 0$, we know that $\{y_{n_k}\} \to p$. Since $f$ is continuous, it follows that $\{f(x_{n_k})\} \to f(p)$ and $\{f(y_{n_k})\} \to f(p)$. Therefore, $\{f(x_{n_k}) - f(y_{n_k})\} \to 0$. This is impossible since $(-\epsilon, \epsilon)$ excludes $f(x_{n_k}) - f(y_{n_k})$ for all $k \in \mathbb{N}$. Hence, $f$ is uniformly continuous.


Here is an example of a uniformly continuous function. 


Example 4.3. Let f(x) = 5x. Prove f is uniformly continuous. 


Solution. Let $\epsilon > 0$. Let $\delta = \frac{\epsilon}{5}$. If $|x - y| < \delta$ then $|f(x) - f(y)| = |5x - 5y| = 5|x - y| < 5(\frac{\epsilon}{5}) = \epsilon$. Hence, $f$ is uniformly continuous.




Uniform continuity is a useful property which is stronger than continuity. Continuity 
at a point p occurs when, for an arbitrary distance $\epsilon > 0$, it is the case that if points x 
are sufficiently close to p, that is to say within some distance $\delta_p > 0$ of p, their images f(x) 
are within distance $\epsilon$ of f(p). For a uniformly continuous function f, the choice of $\delta_p$ can 
be made the same for every point p in the domain, so there is a uniform choice of $\delta$ for a 
given choice of $\epsilon$ (a choice that does not vary with the choice of point p in the domain of 
f). In the next section we will discuss derivatives, and in one of the exercises we address 
the fact that if the derivative is bounded for a differentiable function whose domain is an 
interval then the function is uniformly continuous. This is not an if and only if condition, 
however. There are functions whose derivatives are unbounded that are still uniformly 
continuous. The preceding theorem tells us that if we restrict a continuous function to a 
closed and bounded domain then it is always uniformly continuous, but this does not follow 
if the domain is not bounded, and it does not follow if the domain is not closed. We leave 
the case where the domain is bounded but not closed to one of the exercises. Below is an 
example of a case where a function is not uniformly continuous where the domain is closed 
but not bounded. 


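Example 4.4. Let $f(x) = x^2$. Then prove $f$ is not uniformly continuous on $\mathbb{R}$.


Solution. Let $\delta > 0$. We know that for $x > 0$, $(x + \frac{\delta}{2})^2 - x^2 = \delta x + \frac{\delta^2}{4} > \delta x$. Thus, if $x \ge \frac{1}{\delta}$ then the points $x$ and $x + \frac{\delta}{2}$ are within $\delta$ of each other but $|(x + \frac{\delta}{2})^2 - x^2| > \delta \cdot \frac{1}{\delta} = 1$. Since no $\delta$ works for $\epsilon = 1$, $f$ is not uniformly continuous.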
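The choice of points in Example 4.4 can be verified with exact rational arithmetic. This is a sketch for illustration only; the helper name witness and the particular values of delta are our own assumptions, and we take x = 1/delta, the smallest value the estimate allows.

```python
from fractions import Fraction

def witness(delta: Fraction):
    """For f(x) = x**2, return points x, y with |x - y| < delta
    but |f(x) - f(y)| > 1, using x = 1/delta and y = x + delta/2."""
    x = 1 / delta
    y = x + delta / 2
    return x, y

# Try a few hypothetical values of delta; Fractions avoid rounding error.
for k in range(1, 6):
    delta = Fraction(1, 10 ** k)
    x, y = witness(delta)
    print(delta, abs(x - y) < delta, abs(x * x - y * y) > 1)
```

However small delta is chosen, the pair stays within delta while the squares differ by more than 1, so no single delta serves for epsilon = 1.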




Exercises: 


Exercise 4.1. Let $f, g : D \to \mathbb{R}$ be functions so that $\lim_{x \to c} f(x) = M$ and for some $\delta > 0$ it is true that if $|x - c| < \delta$ and $x \in D$ then $|g(x) - L| \le |f(x) - M|$. Then $\lim_{x \to c} g(x) = L$.


Exercise 4.2. Let $f : \mathbb{N} \to \mathbb{R}$ be defined by $f(n) = x_n$ for all $n \in \mathbb{N}$. Then $\lim_{n \to \infty} f(n) = L$ if and only if $\{x_n\} \to L$.


Exercise 4.3. Let $\lim_{x \to c} f(x) = L$ and let $\epsilon > 0$. If $f(x) = g(x)$ for all $x \in \mathrm{dom}(g) \cap (c - \epsilon, c + \epsilon) \setminus \{c\}$ and $c$ is a limit point of the domain of $g$ then $\lim_{x \to c} g(x) = L$.


Exercise 4.4. Let $f : D \to \mathbb{R}$ and let $c$ be a limit point of $D$. Then $\lim_{x \to c} f(x) = L$ if and only if $\lim_{x \to c} (f(x) - L) = 0$.


Exercise 4.5. Let $E \subset \mathbb{R}$. Then $\overline{E}$ is closed.


Exercise 4.6. Prove Theorem 4.2.


Exercise 4.7. Let $f : [a, b] \to [a, b]$ be continuous. Then there is a point $c \in [a, b]$ so that $f(c) = c$.


Exercise 4.8. Every polynomial is a continuous function.


Exercise 4.9. Let $f$ be continuous at a point $c$, where $f(c) > 0$. Show that there is a positive number $\delta > 0$ and a positive number $M > 0$ so that $f(x) > M$ for all $x \in (c - \delta, c + \delta) \cap \mathrm{dom}(f)$.


Exercise 4.10. Give an example, with proof, of a function with bounded domain which is 
continuous but not uniformly continuous. 


Exercise 4.11. Give an example, with proof, of a function with bounded range which is 
continuous but not uniformly continuous. For this example, you may assume that standard 
trigonometric functions and exponential and logarithmic functions are continuous on their 
domains (this will be shown in the next chapter). 




Exercise 4.12. Let $a > 0$. Then there is a unique number $c > 0$, the principal nth root of $a$, having the property that $c^n = a$. We denote this number $c = a^{\frac{1}{n}}$.


Exercise 4.13. Let $f : [a, b] \to \mathbb{R}$ be continuous. Then $f([a, b])$ is a closed interval.


Exercise 4.14. Let $\{x_n\}$ be a Cauchy sequence in the domain of a uniformly continuous function $f$. Then $\{f(x_n)\}$ is a Cauchy sequence.


Exercise 4.15. Let $f : A \to \mathbb{R}$ be uniformly continuous, where $A$ is bounded. Then $f(A)$ is bounded.


Exercise 4.16. Let $f, g : D \to \mathbb{R}$ be functions with $c$ a limit point of $D$, so that $g$ is bounded on $(c - \epsilon, c + \epsilon)$ for some $\epsilon > 0$ and $\lim_{x \to c} f(x) = 0$. Then $\lim_{x \to c} f(x)g(x) = 0$.


Exercise 4.17. Let $f$ be continuous on $(a, b)$. Then $f$ is uniformly continuous if and only if there is a continuous function $g : [a, b] \to \mathbb{R}$ such that $f(x) = g(x)$ for all $x \in (a, b)$.


Exercise 4.18. If $\lim_{x \to c} f(x) = \infty$ then $\lim_{x \to c} \frac{1}{f(x)} = 0$.




Hints: 


Hint to Exercise 4.1. Let $f, g : D \to \mathbb{R}$ be functions so that $\lim_{x \to c} f(x) = M$ and for some $\delta > 0$ it is true that if $|x - c| < \delta$ and $x \in D$ then $|g(x) - L| \le |f(x) - M|$. Then $\lim_{x \to c} g(x) = L$.


Try to use the definition of limit directly, picking a distance from $c$ which is less than $\delta$.


Hint to Exercise 4.2. Let $f : \mathbb{N} \to \mathbb{R}$ be defined by $f(n) = x_n$ for all $n \in \mathbb{N}$. Then $\lim_{n \to \infty} f(n) = L$ if and only if $\{x_n\} \to L$.


Write the definitions of convergence and limit at infinity and compare them.


Hint to Exercise 4.3. Let $\lim_{x \to c} f(x) = L$ and let $\epsilon > 0$. If $f(x) = g(x)$ for all $x \in \mathrm{dom}(g) \cap (c - \epsilon, c + \epsilon) \setminus \{c\}$ and $c$ is a limit point of the domain of $g$ then $\lim_{x \to c} g(x) = L$.


Write down what the definition of limit being equal to $L$ is for both functions and compare the definitions.


Hint to Exercise 4.4. Let $f : D \to \mathbb{R}$ and let $c$ be a limit point of $D$. Then $\lim_{x \to c} f(x) = L$ if and only if $\lim_{x \to c} (f(x) - L) = 0$.


Write the definition of each and compare them. Alternately, use the Sequential Characterization of Limits and Theorem 3.3.


Hint to Exercise 4.5. Let $E \subset \mathbb{R}$. Then $\overline{E}$ is closed.


Show that a limit point of $\overline{E}$ is also a limit point of $E$.


Hint to Exercise 4.6. Prove Theorem 4.2.


Parallel the Sequential Characterization of Limits proof.


Hint to Exercise 4.7. Let $f : [a, b] \to [a, b]$ be continuous. Then there is a point $c \in [a, b]$ so that $f(c) = c$.


Use the Intermediate Value Theorem on $h(x) = f(x) - x$.


Hint to Exercise 4.8. Every polynomial is a continuous function.


Use induction and the fact that the product and sum of continuous functions is continuous.




Hint to Exercise 4.9. Let $f$ be continuous at a point $c$, where $f(c) > 0$. Show that there is a positive number $\delta > 0$ and a positive number $M > 0$ so that $f(x) > M$ for all $x \in (c - \delta, c + \delta) \cap \mathrm{dom}(f)$.


Use the definition of continuity to show that for some $\delta > 0$, for all $x \in (c - \delta, c + \delta) \cap \mathrm{dom}(f)$ it is true that $|f(x) - f(c)| < \frac{f(c)}{2}$.


Hint to Exercise 4.10. Give an example, with proof, of a function with bounded domain which is continuous but not uniformly continuous.


Consider functions whose tangent line slopes approach infinity or negative infinity.


Hint to Exercise 4.11. Give an example, with proof, of a function with bounded range which is continuous but not uniformly continuous. For this example, you may assume that standard trigonometric functions and exponential and logarithmic functions are continuous on their domains (this will be shown in the next chapter).


Consider functions that oscillate a lot.


Hint to Exercise 4.12. Let $a > 0$. Then there is a unique number $c > 0$, the principal nth root of $a$, having the property that $c^n = a$. We denote this number $c = a^{\frac{1}{n}}$.


Use the Intermediate Value Theorem.


Hint to Exercise 4.13. Let $f : [a, b] \to \mathbb{R}$ be continuous. Then $f([a, b])$ is a closed interval.


Use the Extreme Value Theorem and the Intermediate Value Theorem.


Hint to Exercise 4.14. Let $\{x_n\}$ be a Cauchy sequence in the domain of a uniformly continuous function $f$. Then $\{f(x_n)\}$ is a Cauchy sequence.


Use the definitions of uniform continuity and Cauchy sequence.


Hint to Exercise 4.15. Let $f : A \to \mathbb{R}$ be uniformly continuous, where $A$ is bounded. Then $f(A)$ is bounded.


Either use the fact that the uniformly continuous image of a Cauchy sequence is a Cauchy sequence or use a finite collection of points that are spaced so that every point in $A$ is near one of them and the definition of uniform continuity.


Hint to Exercise 4.16. Let $f, g : D \to \mathbb{R}$ be functions with $c$ a limit point of $D$, so that $g$ is bounded on $(c - \epsilon, c + \epsilon)$ for some $\epsilon > 0$ and $\lim_{x \to c} f(x) = 0$. Then $\lim_{x \to c} f(x)g(x) = 0$.


You can use the definitions directly, or use the Sequential Characterization of Limits and Theorem 3.4.




Hint to Exercise 4.17. Let $f$ be continuous on $(a, b)$. Then $f$ is uniformly continuous if and only if there is a continuous function $g : [a, b] \to \mathbb{R}$ such that $f(x) = g(x)$ for all $x \in (a, b)$.


If you assume that $f$ is uniformly continuous on $(a, b)$ then take a sequence $\{x_n\}$ converging to $b$. Explain why $\{f(x_n)\}$ is a Cauchy sequence and define $g(b)$ to be the point to which this sequence converges, and then prove that $g$ is continuous at $b$.


Hint to Exercise 4.18. If $\lim_{x \to c} f(x) = \infty$ then $\lim_{x \to c} \frac{1}{f(x)} = 0$.


Use the definitions, and the fact that the reciprocal of a sufficiently large number can be made as small as we wish.



Solutions: 


Solution to Exercise 4.1. Let $f, g : D \to \mathbb{R}$ be functions so that $\lim_{x \to c} f(x) = M$ and for some $\delta > 0$ it is true that if $|x - c| < \delta$ and $x \in D$ then $|g(x) - L| \le |f(x) - M|$. Then $\lim_{x \to c} g(x) = L$.


Proof. Let $\epsilon > 0$. Choose $0 < \gamma < \delta$ so that if $0 < |x - c| < \gamma$ and $x \in D$ then $|f(x) - M| < \epsilon$. Then since $|x - c| < \delta$ it also follows that $|g(x) - L| \le |f(x) - M| < \epsilon$. Hence, $\lim_{x \to c} g(x) = L$.


Solution to Exercise 4.2. Let $f : \mathbb{N} \to \mathbb{R}$ be defined by $f(n) = x_n$ for all $n \in \mathbb{N}$. Then $\lim_{n \to \infty} f(n) = L$ if and only if $\{x_n\} \to L$.


Proof. Assume $\lim_{n \to \infty} f(n) = L$. Let $\epsilon > 0$. Then there is some $B$ so that if $n > B$ then $|x_n - L| < \epsilon$. Choose $k \in \mathbb{N}$ so that $k > B$. Then if $n \ge k$ it follows that $|x_n - L| < \epsilon$, so $\{x_n\} \to L$.

Assume $\{x_n\} \to L$. Let $\epsilon > 0$. We can choose $k \in \mathbb{N}$ so that if $n \ge k$ then $|x_n - L| < \epsilon$, which means $\lim_{n \to \infty} f(n) = L$.


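Solution to Exercise 4.3. Let $\lim_{x \to c} f(x) = L$ and let $\epsilon > 0$. If $f(x) = g(x)$ for all $x \in \mathrm{dom}(g) \cap (c - \epsilon, c + \epsilon) \setminus \{c\}$ and $c$ is a limit point of the domain of $g$ then $\lim_{x \to c} g(x) = L$.


Proof. Let $\epsilon_1 > 0$. Choose $\delta > 0$ so that if $0 < |x - c| < \delta$ and $x \in \mathrm{dom}(f)$ then $|f(x) - L| < \epsilon_1$. Let $\delta_1 = \min\{\epsilon, \delta\}$. Then if $0 < |x - c| < \delta_1$ and $x \in \mathrm{dom}(g)$ then $g(x) = f(x)$, so $|g(x) - L| < \epsilon_1$. Hence, $\lim_{x \to c} g(x) = L$.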


Solution to Exercise 4.4. Let $f : D \to \mathbb{R}$ and let $c$ be a limit point of $D$. Then $\lim_{x \to c} f(x) = L$ if and only if $\lim_{x \to c} (f(x) - L) = 0$.


Proof. We say that $\lim_{x \to c} f(x) = L$ if and only if for every $\epsilon > 0$ there is a $\delta > 0$ so that if $0 < |x - c| < \delta$ and $x \in D$ then $|f(x) - L| < \epsilon$.

We say $\lim_{x \to c} (f(x) - L) = 0$ if and only if for every $\epsilon > 0$ there is a $\delta > 0$ so that if $0 < |x - c| < \delta$ and $x \in D$ then $|f(x) - L - 0| < \epsilon$.

Since these two statements are equivalent, the theorem follows.


Solution to Exercise 4.5. Let $E \subset \mathbb{R}$. Then $\overline{E}$ is closed.


Proof. Let $p$ be a limit point of $\overline{E}$ and let $(a, b)$ be an open interval containing $p$. Then $(a, b) \cap \overline{E}$ contains a point $q \neq p$. If $q \in E$ then $q$ is a point of $(a, b) \cap (E \setminus \{p\})$. If $q \in (\overline{E} \setminus E)$ then $q$ is a limit point of $E$, which means that $(a, b)$ contains infinitely many points of $E$, so we can find a point $z \in (a, b) \cap (E \setminus \{p\})$. Since every open interval containing $p$ contains a point of $E$ distinct from $p$, we know that $p$ is a limit point of $E$, so $p \in \overline{E}$. Hence, $\overline{E}$ contains all of its limit points and is therefore closed.


Solution to Exercise 4.6. Prove Theorem 4.2. 


Proof. First, assume that $f$ is continuous at $c$. Let $\{x_n\} \subset D$ such that $\{x_n\} \to c$. Let $\epsilon > 0$. Then for some $\delta > 0$, we know that if $|x - c| < \delta$ and $x \in D$ then $|f(x) - f(c)| < \epsilon$. Since $\{x_n\} \to c$, we can find $N \in \mathbb{N}$ so that if $n \ge N$ then $|x_n - c| < \delta$, which implies that $|f(x_n) - f(c)| < \epsilon$, so $\{f(x_n)\} \to f(c)$.

Next, assume that for every sequence $\{x_n\} \subset D$, if $\{x_n\} \to c$ then $\{f(x_n)\} \to f(c)$. Suppose that $f$ is not continuous at $c$. Then we can find an $\epsilon > 0$ so that for every $\delta > 0$ there is some $x \in D$ so that $|x - c| < \delta$ but $|f(x) - f(c)| \ge \epsilon$. For each $n \in \mathbb{N}$ we choose $x_n \in D$ so that $|x_n - c| < \frac{1}{n}$ and $|f(x_n) - f(c)| \ge \epsilon$. Since $c - \frac{1}{n} < x_n < c + \frac{1}{n}$, we know by the Squeeze Theorem that $\{x_n\} \to c$. But $\{f(x_n)\} \not\to f(c)$, contradicting our assumption. Thus, $f$ is continuous at $c$.


Solution to Exercise 4.7. Let $f : [a, b] \to [a, b]$ be continuous. Then there is a point $c \in [a, b]$ so that $f(c) = c$.


Proof. If $f(a) = a$ or $f(b) = b$ then we have a fixed point as desired. Assume $f(a) > a$ and $f(b) < b$. Let $h(x) = f(x) - x$. Since $x$ is continuous and $f$ is continuous, we know that $h$ is continuous. Also, $h(a) > 0 > h(b)$, so by the Intermediate Value Theorem there is some $c \in (a, b)$ so that $h(c) = 0$, which means that $f(c) = c$.


Solution to Exercise 4.8. Every polynomial is a continuous function. 


Proof. We will proceed by induction on the degree of the polynomial. We know a polynomial of degree at most one, $f(x) = ax + b$, is continuous by Theorem 4.3 and Theorem 4.7. Assume that all polynomials of degree at most $k$ are continuous. Let $P(x) = a_{k+1}x^{k+1} + a_k x^k + \dots + a_1 x + a_0$ be a polynomial of degree $k + 1$. Then we know that $a_{k+1}x^k$, $x$, and $a_k x^k + \dots + a_1 x + a_0$ are each continuous from the induction hypothesis, so by Theorem 4.7 we conclude that $P(x)$ is continuous since $P(x) = (a_{k+1}x^k)(x) + (a_k x^k + \dots + a_1 x + a_0)$. By induction, the result follows.


Solution to Exercise 4.9. Let $f$ be continuous at a point $c$, where $f(c) > 0$. Show that there is a positive number $\delta > 0$ and a positive number $M > 0$ so that $f(x) > M$ for all $x \in (c - \delta, c + \delta) \cap \mathrm{dom}(f)$.


Proof. Choose $\delta > 0$ so that if $|x - c| < \delta$ and $x \in \mathrm{dom}(f)$ then $|f(x) - f(c)| < \frac{f(c)}{2}$. Then by Theorem 1.17, we know that $f(x) > f(c) - \frac{f(c)}{2} = \frac{f(c)}{2}$ for all $x \in (c - \delta, c + \delta) \cap \mathrm{dom}(f)$.


Solution to Exercise 4.10. Give an example, with proof, of a function with bounded 
domain which is continuous but not uniformly continuous. 


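Proof. Let $f(x) = \frac{1}{x}$ on $(0, 1)$. Then $f$ is continuous by Theorem 4.7, but if we set $\epsilon = 1$ and choose any $\delta > 0$ we can pick an integer $k \ge 2$ so that $\frac{1}{k} < \delta$, and then $|\frac{1}{k} - \frac{1}{k+1}| < \delta$, but $|f(\frac{1}{k+1}) - f(\frac{1}{k})| = |k + 1 - k| = 1$. Thus, $f$ is not uniformly continuous.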
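The pair of points used in this solution can be produced mechanically for any given delta. The sketch below is illustrative only; the helper name witness_reciprocal is our own, and exact fractions are used so that the equality |1/x - 1/y| = 1 is not disturbed by rounding.

```python
from fractions import Fraction
import math

def witness_reciprocal(delta: Fraction):
    """Return x = 1/k and y = 1/(k+1) in (0, 1) with |x - y| < delta,
    for which f(x) = 1/x satisfies |f(x) - f(y)| = 1."""
    # k > 1/delta guarantees 1/k < delta; k >= 2 keeps 1/k inside (0, 1).
    k = max(2, math.floor(1 / delta) + 1)
    return Fraction(1, k), Fraction(1, k + 1)

x, y = witness_reciprocal(Fraction(1, 100))
print(x, y, abs(x - y), abs(1 / x - 1 / y))
```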


Solution to Exercise 4.11. Give an example, with proof, of a function with bounded range 
which is continuous but not uniformly continuous. For this example, you may assume that 
standard trigonometric functions and exponential and logarithmic functions are continuous 
on their domains (this will be shown in the next chapter). 


Proof. Let $f(x) = \sin(\frac{1}{x})$ on $(0, 1)$. We know $f$ is a composition of a continuous function and a ratio of two continuous functions and is continuous by Theorems 4.12 and 4.7. For any $\delta > 0$ we can choose $n \in \mathbb{N}$ with $\frac{1}{2\pi n} < \delta$ so that $|\frac{1}{2\pi n + \frac{\pi}{2}} - \frac{1}{2\pi n + \frac{3\pi}{2}}| < \delta$, but $|f(\frac{1}{2\pi n + \frac{\pi}{2}}) - f(\frac{1}{2\pi n + \frac{3\pi}{2}})| = |1 - (-1)| = 2$. Thus, $f$ is not uniformly continuous.
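A quick floating point experiment shows the two families of points from this solution drawing together while their images stay 2 apart. This is a sketch under our own naming (oscillation_pair and f are hypothetical helpers), and the values are approximate because the computed multiples of pi are rounded.

```python
import math

def f(x: float) -> float:
    """sin(1/x) for x in (0, 1)."""
    return math.sin(1.0 / x)

def oscillation_pair(n: int):
    """Points where sin(1/x) is (approximately) +1 and -1 respectively."""
    x = 1.0 / (2 * math.pi * n + math.pi / 2)
    y = 1.0 / (2 * math.pi * n + 3 * math.pi / 2)
    return x, y

for n in (1, 10, 100, 1000):
    x, y = oscillation_pair(n)
    print(f"n = {n:5d}   |x - y| = {abs(x - y):.3e}   |f(x) - f(y)| = {abs(f(x) - f(y)):.6f}")
```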
2an + 


Solution to Exercise 4.12. Let $a > 0$. Then there is a unique number $c > 0$, the nth root of $a$, having the property that $c^n = a$.


Proof. Let $f(x) = x^n$. Then we know $f$ is continuous by Exercise 4.8. Also, $f(0) = 0 < a$ and $(a + 1)^n = 1 + na + \dots + a^n > a$ by the Binomial Theorem. Hence, by the Intermediate Value Theorem, for some $c \in (0, a + 1)$ it follows that $f(c) = a$, or in other words, $c^n = a$. The fact that this number is unique follows from Exercise 2.13, which shows that $f(x)$ is increasing on $[0, \infty)$ and therefore one to one.
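The bracket (0, a + 1) from this solution also gives a simple way to approximate the nth root numerically by repeated halving. The following is only a sketch: the function name nth_root and the fixed number of halvings are our own assumptions, and it relies on x^n being increasing on [0, ∞) as noted above.

```python
def nth_root(a: float, n: int, steps: int = 80) -> float:
    """Approximate the positive c with c**n == a by halving [0, a + 1]."""
    if a <= 0 or n < 1:
        raise ValueError("need a > 0 and n >= 1")
    lo, hi = 0.0, a + 1.0  # 0**n = 0 < a < (a + 1)**n
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid ** n > a:
            hi = mid
        else:
            lo = mid
    return lo

print(nth_root(2.0, 2))   # close to the square root of 2
print(nth_root(27.0, 3))  # close to 3
```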


Solution to Exercise 4.13. Let f : [a,b] > R be continuous. Then f([a,b]) is a closed 
interval. 


Proof. By the Extreme Value Theorem there are points $s, t \in [a, b]$ so that $f(s) \le f(x) \le f(t)$ for all $x \in [a, b]$. By the Intermediate Value Theorem, for any $k \in (f(s), f(t))$ there is a point $c$ between $s$ and $t$ so that $f(c) = k$. Hence, the set of points in $f([a, b])$ is exactly the interval $[f(s), f(t)]$.


Solution to Exercise 4.14. Let $\{x_n\}$ be a Cauchy sequence in the domain of a uniformly continuous function $f$. Then $\{f(x_n)\}$ is a Cauchy sequence.


Proof. Let $\epsilon > 0$. Choose $\delta > 0$ so that if $|x - y| < \delta$ and $x, y \in \mathrm{dom}(f)$ then $|f(x) - f(y)| < \epsilon$. Choose $k \in \mathbb{N}$ so that if $n, m \ge k$ then $|x_n - x_m| < \delta$. Then if $n, m \ge k$ we know that $|f(x_n) - f(x_m)| < \epsilon$, so $\{f(x_n)\}$ is a Cauchy sequence.


Solution to Exercise 4.15. Let f : A — R be uniformly continuous, where A is bounded. 
Then f(A) is bounded. 


Proof. Suppose $f$ is not bounded. Then for each $n \in \mathbb{N}$ we can choose $x_n \in A$ so that $|f(x_n)| > n$. Since $A$ is bounded we know $\{x_n\}$ is bounded and has a convergent subsequence $\{x_{n_k}\}$, so $\{x_{n_k}\}$ is a Cauchy sequence. By Exercise 4.14, this means that $\{f(x_{n_k})\}$ is a Cauchy sequence, but this is impossible since $\{f(x_{n_k})\}$ is not bounded because for any $M > 0$ we can find $k \in \mathbb{N}$ so that $k > M$, so $|f(x_{n_k})| > n_k \ge k > M$ by Theorem 3.12. Hence, $f(A)$ is bounded.


Solution to Exercise 4.16. Let $f, g : D \to \mathbb{R}$ be functions with $c$ a limit point of $D$, so that $g$ is bounded on $(c - \epsilon, c + \epsilon)$ for some $\epsilon > 0$ and $\lim_{x \to c} f(x) = 0$. Then $\lim_{x \to c} f(x)g(x) = 0$.


Proof. Let $\{x_n\} \to c$ where $\{x_n\} \subset D \setminus \{c\}$. Then by the Sequential Characterization of Limits we know that $\{f(x_n)\} \to 0$. Choose $k \in \mathbb{N}$ so that if $n \ge k$ then $x_n \in (c - \epsilon, c + \epsilon)$. Then by Theorem 3.4, we know that $\{f(x_{n+k})g(x_{n+k})\} \to 0$, so by Theorem 3.26 we know that $\{f(x_n)g(x_n)\} \to 0$. Thus, by the Sequential Characterization of Limits again, we know that $\lim_{x \to c} f(x)g(x) = 0$.


Alternate proof:


Proof. Since $g$ is bounded we can find $M > 0$ so that $|g(x)| < M$ for all $x \in D \cap (c - \epsilon, c + \epsilon)$. Let $\epsilon_1 > 0$. Choose $0 < \delta < \epsilon$ so that if $0 < |x - c| < \delta$ and $x \in D$ then $|f(x)| < \frac{\epsilon_1}{M}$. If $0 < |x - c| < \delta$ and $x \in D$ then $|g(x)f(x)| < \frac{\epsilon_1}{M} M = \epsilon_1$, which means that $\lim_{x \to c} f(x)g(x) = 0$.


Solution to Exercise 4.17. Let $f$ be continuous on $(a, b)$. Then $f$ is uniformly continuous if and only if there is a continuous function $g : [a, b] \to \mathbb{R}$ such that $f(x) = g(x)$ for all $x \in (a, b)$.


Proof. First, assume that $g$ exists. Then since $g$ is continuous on $[a, b]$ we know that $g$ is uniformly continuous, which implies that $f$ is uniformly continuous.

Next, assume that $f$ is uniformly continuous. Let $\{x_n\} \to a$, where $\{x_n\} \subset (a, b)$. Then $\{x_n\}$ is a Cauchy sequence, so $\{f(x_n)\}$ is a Cauchy sequence by Exercise 4.14, which means that $\{f(x_n)\}$ converges to a point which we will define to be $g(a)$ (by Theorem 3.25).

Let $\{y_n\} \subset (a, b)$ so that $\{y_n\} \to a$. Let $\epsilon > 0$. Since $f$ is uniformly continuous, we can choose $\delta > 0$ so that if $|x - y| < \delta$ then $|f(x) - f(y)| < \epsilon$. Since $\{x_n\} \to a$ and $\{y_n\} \to a$ we know that $\{x_n - y_n\} \to 0$. Thus, we can find $k \in \mathbb{N}$ so that if $n \ge k$ then $|x_n - y_n| < \delta$, so $|f(x_n) - f(y_n)| < \epsilon$. Hence, $\{f(x_n) - f(y_n)\} \to 0$, which means that $\{f(y_n)\} \to g(a)$ by Exercise 4.10. This implies that $\lim_{x \to a} f(x) = g(a)$ by the Sequential Characterization of Limits, so $g$ is continuous at $a$. We similarly assign $g(b)$ by using the same process (replace $a$ by $b$ in the preceding argument). Hence, $g(x) = f(x)$ for $x \in (a, b)$, with $g(a)$ and $g(b)$ thus defined, is continuous.


Solution to Exercise 4.18. If $\lim_{x \to c} f(x) = \infty$ then $\lim_{x \to c} \frac{1}{f(x)} = 0$.

Proof. Let $\epsilon > 0$. Since $\lim_{x \to c} f(x) = \infty$, we can choose $\delta > 0$ so that if $0 < |x - c| < \delta$ then $f(x) > \frac{1}{\epsilon}$, which means that $0 < \frac{1}{f(x)} < \epsilon$, and thus $\lim_{x \to c} \frac{1}{f(x)} = 0$.


Chapter 5 


Differentiation 


Definition 30

Let $f : D \to \mathbb{R}$ and let $x_0 \in D^\circ$. We say that $y = f(x)$ is differentiable at $x_0$ if

$$\lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}$$

exists, in which case we call this limit the derivative of $f$ at $x_0$, denoted $f'(x_0)$ or $\frac{d}{dx} f(x_0)$.
We let $f''(x)$ denote $(f'(x))'$, and let $f'''(x) = (f''(x))'$ and so on. We also use the notation $f^{(i)}(x)$ to denote the $i$th derivative of $f$ at $x$ (in other words, we define $f^{(i)}(x)$ recursively by stating that $f^{(i)}(x) = (f^{(i-1)}(x))'$ for natural numbers $i > 1$, where $f^{(1)}(x) = f'(x)$).
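The limit in Definition 30 can be watched numerically. The following is a minimal sketch, not part of the original text: the helper name `difference_quotient` and the choice $f(x) = x^2$, $x_0 = 3$ are assumptions made purely for illustration. It evaluates the difference quotient for shrinking values of $h$.

```python
def difference_quotient(f, x0, h):
    """Return (f(x0 + h) - f(x0)) / h, the quotient whose limit defines f'(x0)."""
    return (f(x0 + h) - f(x0)) / h


def square(x):
    return x * x


# For f(x) = x^2 the quotient at x0 = 3 equals 6 + h, so it should approach 6.
quotients = [difference_quotient(square, 3.0, 10.0 ** (-k)) for k in range(1, 7)]
for k, q in enumerate(quotients, start=1):
    print(f"h = 1e-{k}: quotient = {q:.8f}")
```

A table of values like this only suggests the limit; the definition itself requires the $\epsilon$-$\delta$ (or sequential) argument.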


Note that if a point $c$ lies in the interior of the domain $D$ of a function, then the definitions of limit and continuity at $c$ can omit the "and $x \in D$" portion, because by choosing $\delta$ small enough it is guaranteed that all $x$ so that $|x - c| < \delta$ will be in the domain of the function. For a function which is differentiable at a point $p$, the point $p$ is always in the interior of the domain.


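Theorem 5.1. For any set $D$ it is true that $x_0 \in D^\circ$ if and only if $x_0 \in (a,b) \subset D$ for some open interval $(a,b)$.

Proof. We know that $x_0 \in D^\circ$ if and only if $x_0$ is contained in the open interval $(x_0 - \epsilon, x_0 + \epsilon) \subset D$ for some $\epsilon > 0$. In particular, if $x_0 \in D^\circ$ then $(a,b) = (x_0 - \epsilon, x_0 + \epsilon)$ is an open interval with $x_0 \in (a,b) \subset D$. Conversely, if $x_0 \in (a,b) \subset D$ for some open interval $(a,b)$ then since $(a,b)$ is an open set by Theorem 3.2, we can find $\epsilon > 0$ so that $x_0 \in (x_0 - \epsilon, x_0 + \epsilon) \subset (a,b) \subset D$, which means that $x_0 \in D^\circ$.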


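Theorem 5.2. Let $f : (a,b) \to \mathbb{R}$ and let $c \in (a,b)$. Then $f'(c)$ exists if and only if $\lim_{x \to c} \frac{f(x) - f(c)}{x - c}$ exists, in which case $f'(c)$ equals this limit.

Proof. We know $f'(c) = \lim_{h \to 0} \frac{f(c+h) - f(c)}{h}$ if and only if for every $\epsilon > 0$ there is a $\delta > 0$ so that if $0 < |h - 0| < \delta$ then $\left|\frac{f(c+h) - f(c)}{h} - f'(c)\right| < \epsilon$, which is true if and only if for every $\epsilon > 0$ there is a $\delta > 0$ so that if $0 < |x - c| < \delta$ then $\left|\frac{f(c + x - c) - f(c)}{x - c} - f'(c)\right| < \epsilon$, which is true if and only if $f'(c) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c}$.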


Theorem 5.3. Let $f : (a,b) \to \mathbb{R}$ and let $c \in (a,b)$. If $f$ is differentiable at $c$ then $f$ is continuous at $c$.

Proof. Since $\lim_{x \to c} (f(x) - f(c)) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c}(x - c) = f'(c)(0) = 0$, it follows that $\lim_{x \to c} f(x) = f(c)$, so $f$ is continuous at $c$.


There are other ways of looking at differentiability that are sometimes instructive. 


Theorem 5.4. Let $f(x)$ be defined on an open interval containing $x_0$. Then $f$ is differentiable at $x_0$ if and only if there is a function $F : \mathrm{dom}(f) \to \mathbb{R}$ so that $F$ is continuous at $x_0$ and $f(x) = F(x)(x - x_0) + f(x_0)$ for all $x \in \mathrm{dom}(f)$, in which case $F(x_0) = f'(x_0)$.

Proof. First, assume that $f'(x_0)$ exists. Define $F(x) = \frac{f(x) - f(x_0)}{x - x_0}$ if $x \neq x_0$ and define $F(x_0) = f'(x_0)$. Since $\lim_{x \to x_0} F(x) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = f'(x_0) = F(x_0)$ we know that $F$ is continuous at $x_0$. For $x \in \mathrm{dom}(f) \setminus \{x_0\}$ we have that $F(x)(x - x_0) + f(x_0) = \frac{f(x) - f(x_0)}{x - x_0}(x - x_0) + f(x_0) = f(x)$. For $x = x_0$, we see $F(x)(x - x_0) + f(x_0) = f'(x_0)(0) + f(x_0) = f(x)$ as well. Thus, $f(x) = F(x)(x - x_0) + f(x_0)$ for all $x \in \mathrm{dom}(f)$.

Next, assume that there is a function $F : \mathrm{dom}(f) \to \mathbb{R}$ so that $F$ is continuous at $x_0$ and $f(x) = F(x)(x - x_0) + f(x_0)$ for all $x \in \mathrm{dom}(f)$. Solving for $F$, we get that $F(x) = \frac{f(x) - f(x_0)}{x - x_0}$ if $x \neq x_0$. Since $F$ is continuous at $x_0$ we know that $\lim_{x \to x_0} F(x) = F(x_0)$, which means that $\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = F(x_0)$, so by definition $F(x_0) = f'(x_0)$ and $f$ is differentiable at $x_0$.

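Theorem 5.5. Let $f(x)$ be defined on an open interval containing $x_0$. Then $f$ is differentiable at $x_0$ if and only if there is a function $\epsilon(\Delta x)$ defined for all $\Delta x$ so that $x_0 + \Delta x \in \mathrm{dom}(f)$, and a constant $k$, so that $\lim_{\Delta x \to 0} \frac{\epsilon(\Delta x)}{\Delta x} = 0$ and $f(x_0 + \Delta x) - f(x_0) = k\Delta x + \epsilon(\Delta x)$ for all such $\Delta x$, in which case $k = f'(x_0)$.

Proof. First, assume that $f$ is differentiable at $x_0$. Setting $k = f'(x_0)$ and $\epsilon(\Delta x) = f(x_0 + \Delta x) - f(x_0) - f'(x_0)\Delta x$ we have that

$$\lim_{\Delta x \to 0} \frac{\epsilon(\Delta x)}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0) - f'(x_0)\Delta x}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} - f'(x_0) = f'(x_0) - f'(x_0) = 0.$$

Next, assume that there is a function $\epsilon(\Delta x)$ defined for all $\Delta x$ so that $x_0 + \Delta x \in \mathrm{dom}(f)$, and a constant $k$, so that $\lim_{\Delta x \to 0} \frac{\epsilon(\Delta x)}{\Delta x} = 0$ and $f(x_0 + \Delta x) - f(x_0) = k\Delta x + \epsilon(\Delta x)$ for all such $\Delta x$. Then, solving for $\epsilon(\Delta x) = f(x_0 + \Delta x) - f(x_0) - k\Delta x$, we know that $\lim_{\Delta x \to 0} \frac{\epsilon(\Delta x)}{\Delta x} = 0$, so $\lim_{\Delta x \to 0} \left(\frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} - k\right) = 0$, so $\lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} = k$. Hence $f$ is differentiable at $x_0$ and $f'(x_0) = k$.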


Sometimes it is also helpful to rephrase the preceding theorem as follows, depending on 
the context. 


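Theorem 5.6. Let $f(x)$ be defined on an open interval containing $x_0$. Then $f$ is differentiable at $x_0$ if and only if there is a function $\delta(\Delta x)$ defined for all $\Delta x$ so that $x_0 + \Delta x \in \mathrm{dom}(f)$, and a constant $k$, so that $\lim_{\Delta x \to 0} \delta(\Delta x) = 0$ and $f(x_0 + \Delta x) - f(x_0) - k\Delta x = \delta(\Delta x)\Delta x$ for all such $\Delta x$ (or, in other words, $\frac{f(x_0 + \Delta x) - f(x_0) - k\Delta x}{\Delta x} = \delta(\Delta x)$ when $\Delta x \neq 0$), in which case $k = f'(x_0)$.

Proof. Set $\delta(\Delta x) = \frac{\epsilon(\Delta x)}{\Delta x}$ for $\Delta x \neq 0$ (the value of $\delta(0)$ does not matter, so we may take $\delta(0) = 0$) in Theorem 5.5 and the result follows.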
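As a worked instance of Theorems 5.5 and 5.6 (an illustration under the assumed example $f(x) = x^2$, which is not taken from the text), the constant $k$ and the error functions can be read off directly by expanding the square:

```latex
\begin{align*}
f(x_0 + \Delta x) - f(x_0) &= (x_0 + \Delta x)^2 - x_0^2 = 2x_0\,\Delta x + (\Delta x)^2, \\
k &= 2x_0, \qquad \epsilon(\Delta x) = (\Delta x)^2, \qquad \delta(\Delta x) = \Delta x, \\
\lim_{\Delta x \to 0} \frac{\epsilon(\Delta x)}{\Delta x} &= \lim_{\Delta x \to 0} \delta(\Delta x) = \lim_{\Delta x \to 0} \Delta x = 0 .
\end{align*}
```

So in this example the linear part $k\Delta x$ carries the derivative $f'(x_0) = 2x_0$, and the remainder vanishes faster than $\Delta x$.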
The following is another way to look at differentiation in a more geometric context. 


Theorem 5.7. Let $f(x)$ be defined on an open interval containing $x_0$. Then $f$ is differentiable at $x_0$ if and only if there is a number $m$ so that for any numbers $c < m < d$, there is a $\delta > 0$ so that if $|x - x_0| < \delta$ then $f(x)$ is between or equal to $c(x) = c(x - x_0) + f(x_0)$ and $d(x) = d(x - x_0) + f(x_0)$, in which case $m = f'(x_0)$.

Proof. Assume $f'(x_0)$ exists and set $m = f'(x_0)$. Let $\epsilon = \min\{m - c, d - m\}$. Choose $\delta > 0$ so that if $0 < |x - x_0| < \delta$ then $\left|\frac{f(x) - f(x_0)}{x - x_0} - m\right| < \epsilon$, so $c < \frac{f(x) - f(x_0)}{x - x_0} < d$. If $x > x_0$ then $c(x - x_0) < f(x) - f(x_0) < d(x - x_0)$, which means that $c(x - x_0) + f(x_0) < f(x) < d(x - x_0) + f(x_0)$, and if $x < x_0$ then $c(x - x_0) + f(x_0) > f(x) > d(x - x_0) + f(x_0)$. At $x = x_0$ all three quantities equal $f(x_0)$.

Assume there is a number $m$ so that for any numbers $c < m < d$, there is a $\delta > 0$ so that if $|x - x_0| < \delta$ then $f(x)$ is between or equal to $c(x) = c(x - x_0) + f(x_0)$ and $d(x) = d(x - x_0) + f(x_0)$. Let $\epsilon > 0$. Choose $c = m - \epsilon$ and $d = m + \epsilon$. Then there is a $\delta > 0$ so that if $|x - x_0| < \delta$ then $f(x)$ is between or equal to $c(x) = c(x - x_0) + f(x_0)$ and $d(x) = d(x - x_0) + f(x_0)$. If $x > x_0$ then $c(x - x_0) < d(x - x_0)$, so $c(x - x_0) + f(x_0) \leq f(x) \leq d(x - x_0) + f(x_0)$, and $m - \epsilon \leq \frac{f(x) - f(x_0)}{x - x_0} \leq m + \epsilon$, so $-\epsilon \leq \frac{f(x) - f(x_0)}{x - x_0} - m \leq \epsilon$, which means that $\left|\frac{f(x) - f(x_0)}{x - x_0} - m\right| \leq \epsilon$. Similarly, if $x < x_0$ then $c(x - x_0) > d(x - x_0)$, so $c(x - x_0) + f(x_0) \geq f(x) \geq d(x - x_0) + f(x_0)$, which means $-\epsilon \leq \frac{f(x) - f(x_0)}{x - x_0} - m \leq \epsilon$, so $\left|\frac{f(x) - f(x_0)}{x - x_0} - m\right| \leq \epsilon$. Hence, $\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = m = f'(x_0)$.




The last of these forms of the derivative helps us to picture a derivative visually. A function is differentiable at a point $(c, f(c))$ if there is a tangent line with slope $f'(c)$ through that point so that if you take a line of greater slope through $(c, f(c))$ and a line of smaller slope through $(c, f(c))$ then there is a $\delta > 0$ distance about $c$ so that all the points of the graph of $y = f(x)$ that have $x$ values between $c - \delta$ and $c + \delta$ have corresponding $y$ values between the two lines given. Sometimes this is thought of in terms of angles. If you take any double cone of any positive angle between the lines forming the cone with vertex at $(c, f(c))$ where the tangent line bisects the angle, then a vertical band about $x = c$ which is sufficiently narrow will have all points in the graph within that band sandwiched inside the double cone. In other words, near a point where it is differentiable, the graph of a function is approximately a straight line. If you were to zoom in on a portion of the curve of sufficiently small diameter then the magnified image would look more and more like the tangent line.


[Figure: Differentiability at c]


Magnifying the region near (c, f(c)) in the picture above, the graph looks like this. 


[Figure: Magnified Graph near c, showing the point (c, f(c))]


In contrast, $f(x) = |x|$ is not differentiable at the origin. If you were to zoom in on
the vertex of the graph it would always look like a V shape, no matter how much you 
magnified the image. The graph would never look flat and linear because the function is 
not differentiable. 
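The persistent V shape can also be seen in the difference quotients themselves. The sketch below is an illustration only (the helper name `one_sided_quotients` is hypothetical): it computes the quotient of the absolute value function at $0$ using a step to the left and a step to the right. The two values never approach each other, so the two-sided limit in the definition of the derivative does not exist.

```python
def one_sided_quotients(f, x0, h):
    """Return the difference quotients of f at x0 using the steps -h and +h."""
    left = (f(x0 - h) - f(x0)) / (-h)
    right = (f(x0 + h) - f(x0)) / h
    return left, right


for k in range(1, 6):
    h = 10.0 ** (-k)
    left, right = one_sided_quotients(abs, 0.0, h)
    print(f"h = {h:g}: left quotient = {left}, right quotient = {right}")
```

However small the step, the left quotient is $-1$ and the right quotient is $1$.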


[Figure: Graph of y = |x|]

[Figure: Magnifying Graph of y = |x| Near Origin]


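Theorem 5.8. Let $f, g$ be differentiable at $c$. Then there is an open interval $(a,b)$ so that $c \in (a,b) \subset \mathrm{dom}(f) \cap \mathrm{dom}(g)$.

Proof. By the definition of differentiable there are $\epsilon_1, \epsilon_2 > 0$ so that $(c - \epsilon_1, c + \epsilon_1) \subset \mathrm{dom}(f)$ and $(c - \epsilon_2, c + \epsilon_2) \subset \mathrm{dom}(g)$. Letting $\epsilon = \min(\epsilon_1, \epsilon_2)$ we see that $(c - \epsilon, c + \epsilon) \subset \mathrm{dom}(f) \cap \mathrm{dom}(g)$.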

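Theorem 5.9. Let $f, g$ be differentiable at $x$. Then:
(a) $(f + g)'(x) = f'(x) + g'(x)$
(b) $(fg)'(x) = f(x)g'(x) + g(x)f'(x)$
(c) If $g(x) \neq 0$ then $\left(\frac{f}{g}\right)'(x) = \frac{f'(x)g(x) - f(x)g'(x)}{(g(x))^2}$

Proof. By Theorem 5.8 we know that $x$ is contained in an open interval contained in the domains of $f + g$ and $fg$, and by Theorem 4.5, if $g(x) \neq 0$ then $\frac{f}{g}$ is also defined on an open interval containing $x$.

(a) $$(f + g)'(x) = \lim_{h \to 0} \frac{f(x+h) + g(x+h) - (f(x) + g(x))}{h} = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} + \lim_{h \to 0} \frac{g(x+h) - g(x)}{h} = f'(x) + g'(x)$$ by the sum rule for limits.

(b) $$(fg)'(x) = \lim_{h \to 0} \frac{f(x+h)g(x+h) - f(x)g(x)}{h} = \lim_{h \to 0} \frac{f(x+h)g(x+h) - g(x+h)f(x) + g(x+h)f(x) - f(x)g(x)}{h}$$
$$= \lim_{h \to 0} g(x+h) \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} + \lim_{h \to 0} f(x) \lim_{h \to 0} \frac{g(x+h) - g(x)}{h} = f(x)g'(x) + g(x)f'(x)$$ since $g$ is continuous at $x$ by Theorem 5.3, and by Theorems 4.6, 4.11 and 4.4.

(c) $$\left(\frac{f}{g}\right)'(x) = \lim_{h \to 0} \frac{\frac{f(x+h)}{g(x+h)} - \frac{f(x)}{g(x)}}{h} = \lim_{h \to 0} \frac{f(x+h)g(x) - f(x)g(x+h)}{g(x+h)g(x)h}$$
$$= \lim_{h \to 0} \frac{f(x+h)g(x) - f(x)g(x) + f(x)g(x) - f(x)g(x+h)}{g(x+h)g(x)h} = \lim_{h \to 0} \frac{1}{g(x+h)g(x)} \left( g(x)\frac{f(x+h) - f(x)}{h} - f(x)\frac{g(x+h) - g(x)}{h} \right)$$
$$= \frac{f'(x)g(x) - f(x)g'(x)}{(g(x))^2}$$ by Theorems 4.6, 4.11 and 4.4, assuming that $g(x) \neq 0$.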


Theorem 5.10. If $f(x) = x$ (on some open interval $I$) then $f'(x) = 1$ for each $x \in I$. If $f(x) = k$ (on some open interval $I$) then $f'(x) = 0$ for each $x \in I$.

Proof. $\lim_{h \to 0} \frac{x + h - x}{h} = 1$ and $\lim_{h \to 0} \frac{k - k}{h} = 0$ by Theorem 4.3.


Theorem 5.11. If $f$ is differentiable at $x_0$ and $g$ is differentiable at $f(x_0)$ then $g \circ f$ is defined on an open interval containing $x_0$.


The proof of this theorem is left as an exercise. 


For most examples the chain rule is a straightforward idea. If you are traveling at 10 miles per hour currently, and you are painting a road requiring one half of one kilogram of paint per mile, then you are using 5 kilograms of paint per hour. In this manner, if $p = p(s)$ is paint as a function of displacement and $s = s(t)$ is displacement as a function of time then $\frac{dp}{ds}\frac{ds}{dt} = \frac{dp}{dt}$. The amount paint changes per unit of distance times the amount distance changes per unit of time is the amount that paint changes per unit of time. This is the idea behind the chain rule. Some people think of it as cancelling the $ds$ in the fraction above. This, of course, is nonsense, because the symbol $\frac{ds}{dt}$ refers to a limit of a difference quotient of change in position over change in time, and not an actual fraction. There are ways of interpreting derivatives as fractions of differentials and ways of formalizing the operations that correspond to what would have been such symbol cancellations which make sense, but these would have to be developed (and generally follow from the chain rule). Even so, the idea of the chain rule corresponds to the idea of multiplying rates of change as described.
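The paint example can be checked numerically. This is a small sketch under stated assumptions: the constants and the helper `rate` (a symmetric difference quotient) are introduced here for illustration and are not part of the text. It compares the rate of change of the composition $p(s(t))$ with the product of the two separate rates.

```python
PAINT_PER_MILE = 0.5   # kilograms of paint per mile, playing the role of dp/ds
SPEED = 10.0           # miles per hour, playing the role of ds/dt


def s(t):
    """Displacement in miles after t hours at constant speed."""
    return SPEED * t


def p(miles):
    """Paint used, in kilograms, after painting the given number of miles."""
    return PAINT_PER_MILE * miles


def rate(func, x, h=1e-6):
    """Symmetric difference quotient approximating func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)


t0 = 2.0
dp_dt = rate(lambda t: p(s(t)), t0)       # rate of change of the composition
product = rate(p, s(t0)) * rate(s, t0)    # (dp/ds) evaluated at s(t0), times ds/dt
print(dp_dt, product)
```

Both numbers come out at about 5 kilograms per hour, up to rounding error. Note that $\frac{dp}{ds}$ is evaluated at $s(t_0)$, which is exactly the $g'(f(x_0))$ factor in the chain rule.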




We begin with an inadequate proof of the chain rule which is simple. We include it 
because it is correct in the vast majority of cases and it is easy to understand. With the 
restrictions we place on the function, the proof is correct. 


Theorem 5.12. Special Case of Chain Rule. Let $f$ be differentiable at some point $x_0$ so that for some $\epsilon > 0$ it is true that $f(x) \neq f(x_0)$ for all $x \neq x_0$ so that $x \in (x_0 - \epsilon, x_0 + \epsilon)$. Let $g$ be differentiable at $f(x_0)$. Then $(g \circ f)'(x_0) = g'(f(x_0))f'(x_0)$.


Proof. We set $y = f(x)$ and $y_0 = f(x_0)$. Then for $0 < |x - x_0| < \epsilon$ we have

$$\lim_{x \to x_0} \frac{g(f(x)) - g(f(x_0))}{x - x_0} = \lim_{x \to x_0} \frac{g(y) - g(y_0)}{y - y_0} \cdot \frac{y - y_0}{x - x_0} = \lim_{y \to y_0} \frac{g(y) - g(y_0)}{y - y_0} \cdot \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = g'(y_0)f'(x_0) = g'(f(x_0))f'(x_0)$$

by Theorem 4.11 since $g$ is continuous at $y_0$ and $f(x) - f(x_0) \neq 0$.

This is not the full form of the chain rule. It is appealing, and works for most functions. To prove the chain rule properly, however, we need to deal with the fact that $\frac{g(f(x)) - g(f(x_0))}{x - x_0}$ cannot be written as $\frac{g(y) - g(y_0)}{y - y_0} \cdot \frac{y - y_0}{x - x_0}$ when $y = y_0$, which could happen at points arbitrarily close to $x_0$ potentially. This can be done by using the form of differentiability discussed in Theorem 5.4, which essentially creates a continuous function to fill in the holes in the cases where $y = y_0$.


Theorem 5.13. Chain Rule. If $f$ is differentiable at $a$ and $g$ is differentiable at $f(a)$ then $(g \circ f)'(a) = g'(f(a))f'(a)$.

Proof. We know $g \circ f$ is defined on an open interval containing $a$ by Theorem 5.11.

By Theorem 5.4, there is a function $G : \mathrm{dom}(g) \to \mathbb{R}$ which is continuous at $f(a)$ so that $g(x) = G(x)(x - f(a)) + g(f(a))$ for all $x \in \mathrm{dom}(g)$ and $G(f(a)) = g'(f(a))$.

We can write

$$(g \circ f)'(a) = \lim_{x \to a} \frac{g(f(x)) - g(f(a))}{x - a} = \lim_{x \to a} \frac{G(f(x))(f(x) - f(a)) + g(f(a)) - g(f(a))}{x - a} = \lim_{x \to a} G(f(x)) \frac{f(x) - f(a)}{x - a} = G(f(a))f'(a) = g'(f(a))f'(a)$$

since $G$ is continuous at $f(a)$ and $f$ is continuous at $a$.
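To see why the function $G$ from Theorem 5.4 matters, consider the extreme case of a constant inner function, where $f(x) = f(a)$ for every $x$. The sketch below uses assumed example functions ($f(x) = 7$, $g(y) = y^2$) and a hypothetical helper `G`; it is an illustration, not the author's construction verbatim. The split quotient from the special-case proof divides by zero, while the filled-in version gives the expected derivative $0$.

```python
def g(y):
    return y * y


def g_prime(y):
    return 2 * y


def f(x):
    return 7.0          # constant, so f(x) == f(a) for every x


def G(y, y0):
    """Slope function of g at y0 in the style of Theorem 5.4, filled in at y == y0."""
    if y == y0:
        return g_prime(y0)
    return (g(y) - g(y0)) / (y - y0)


a, x = 1.0, 1.001
y0, y = f(a), f(x)

try:
    naive = (g(y) - g(y0)) / (y - y0) * (y - y0) / (x - a)
except ZeroDivisionError:
    naive = None        # the split quotient is undefined when y == y0

filled = G(y, y0) * (f(x) - f(a)) / (x - a)
print(naive, filled)
```

The same failure can occur for non-constant functions that return to the value $f(a)$ at points arbitrarily close to $a$, which is why the full proof avoids dividing by $y - y_0$.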


Definition 31

Let $f$ be a real valued function defined at a point $c$. We say that $(c, f(c))$ is a local maximum for $f$ if there is an $\epsilon > 0$ so that $(c - \epsilon, c + \epsilon) \subset \mathrm{dom}(f)$ and $f(x) \leq f(c)$ for each $x \in (c - \epsilon, c + \epsilon)$. We say that $(c, f(c))$ is a local minimum for $f$ if there is an $\epsilon > 0$ so that for each $x \in (c - \epsilon, c + \epsilon)$ it is true that $f(x) \geq f(c)$. If $(c, f(c))$ is a local maximum or a local minimum then $(c, f(c))$ is a local extremum.




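Theorem 5.14. Fermat's Theorem. Let $f : (a,b) \to \mathbb{R}$ be differentiable and let $(t, f(t))$ be a local extremum for $f$. Then $f'(t) = 0$.

Proof. First, assume that $(t, f(t))$ is a local maximum. Then there is an $\epsilon > 0$ so that $(t - \epsilon, t + \epsilon) \subset \mathrm{dom}(f)$ and $f(t) \geq f(x)$ for all $x \in (t - \epsilon, t + \epsilon)$. Choose sequences $\{x_n\} \subset (t - \epsilon, t)$ and $\{y_n\} \subset (t, t + \epsilon)$ which both converge to $t$. For each $n \in \mathbb{N}$ we know $\frac{f(x_n) - f(t)}{x_n - t} \geq 0$ and $\frac{f(y_n) - f(t)}{y_n - t} \leq 0$, so by the Comparison Theorem it follows that

$$\lim_{x \to t} \frac{f(x) - f(t)}{x - t} = \lim_{n \to \infty} \frac{f(x_n) - f(t)}{x_n - t} \geq 0 \quad \text{and} \quad \lim_{x \to t} \frac{f(x) - f(t)}{x - t} = \lim_{n \to \infty} \frac{f(y_n) - f(t)}{y_n - t} \leq 0,$$

so $f'(t) = \lim_{x \to t} \frac{f(x) - f(t)}{x - t} = 0$ by the Sequential Characterization of Limits.

If $(t, f(t))$ is a local minimum then there is an $\epsilon > 0$ so that $(t - \epsilon, t + \epsilon) \subset \mathrm{dom}(f)$ and $f(t) \leq f(x)$, so $-f(t) \geq -f(x)$, for all $x \in (t - \epsilon, t + \epsilon)$. Thus, $(t, -f(t))$ is a local maximum for the function $-f$, so $-f'(t) = 0$ and hence $f'(t) = 0$.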


Theorem 5.15. Rolle's Theorem. If $f$ is continuous on $[a,b]$ and differentiable on $(a,b)$ and $f(a) = f(b)$ then there is a point $c \in (a,b)$ so that $f'(c) = 0$.

Proof. By the Extreme Value Theorem we can find $s, t \in [a,b]$ so that $f(s) \leq f(x) \leq f(t)$ for all $x \in [a,b]$. If $f(s) = f(t) = f(a)$ then $f$ is constant so $f'(x) = 0$ for all $x \in (a,b)$. Otherwise, $f(t) > f(a)$ and $t \in (a,b)$ or $f(s) < f(a)$ and $s \in (a,b)$. If $s \in (a,b)$ then $(s, f(s))$ is a local minimum for $f$, so $f'(s) = 0$ by Fermat's Theorem. If $t \in (a,b)$ then $(t, f(t))$ is a local maximum for $f$, so $f'(t) = 0$ by Fermat's Theorem. The result follows.


[Figure: Illustration of Rolle's Theorem, marking the point (t, f(t))]


Theorem 5.16. The Cauchy Mean Value Theorem. Let $f$ and $g$ be continuous on $[a,b]$ and differentiable on $(a,b)$. Then there is a point $c \in (a,b)$ so that $f'(c)(g(b) - g(a)) = g'(c)(f(b) - f(a))$.

Proof. Let $h(x) = f(x)(g(b) - g(a)) - g(x)(f(b) - f(a))$. Then $h(a) = h(b) = f(a)g(b) - g(a)f(b)$ and $h'(x) = f'(x)(g(b) - g(a)) - g'(x)(f(b) - f(a))$ for all $x \in (a,b)$ and $h$ is continuous on $[a,b]$. By Rolle's Theorem, there is some $c \in (a,b)$ so that $h'(c) = f'(c)(g(b) - g(a)) - g'(c)(f(b) - f(a)) = 0$, so $f'(c)(g(b) - g(a)) = g'(c)(f(b) - f(a))$.


The Cauchy Mean Value Theorem is also referred to as the Generalized Mean Value 
Theorem. 


Theorem 5.17. The Mean Value Theorem. Let $f$ be continuous on $[a,b]$ and differentiable on $(a,b)$. Then there is a point $c \in (a,b)$ so that $f'(c)(b - a) = f(b) - f(a)$.

Proof. Setting $g(x) = x$ in the Cauchy Mean Value Theorem, we know that there is a point $c \in (a,b)$ so that $f'(c)(g(b) - g(a)) = g'(c)(f(b) - f(a))$, so $f'(c)(b - a) = f(b) - f(a)$.




While the preceding proof is brief, it has less geometric intuitive appeal than the 
following argument. 


Proof. Let $l(x) = \frac{f(b) - f(a)}{b - a}(x - a) + f(a)$, the line through $(a, f(a))$ and $(b, f(b))$. Then if $h(x) = f(x) - l(x)$ we note that $h(a) = 0 = h(b)$, so by Rolle's Theorem, for some $c \in (a,b)$ we know that $h'(c) = 0$. Thus, $f'(c) = l'(c) = \frac{f(b) - f(a)}{b - a}$.
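The Mean Value Theorem only asserts that a suitable point $c$ exists, but for a concrete function one can often locate it. The following sketch makes two assumptions that go beyond the theorem: the example $f(x) = x^3$ on $[0, 2]$, and that $f'$ is continuous with $f' - \text{slope}$ changing sign on the interval, so that bisection applies. The function name `mean_value_point` is hypothetical.

```python
import math


def f(x):
    return x ** 3


def f_prime(x):
    return 3 * x ** 2


def mean_value_point(f, f_prime, a, b, tol=1e-12):
    """Bisect on f'(x) - slope; assumes that difference changes sign on [a, b]."""
    slope = (f(b) - f(a)) / (b - a)
    lo, hi = a, b
    if (f_prime(lo) - slope) * (f_prime(hi) - slope) > 0:
        raise ValueError("f' - slope does not change sign on [a, b]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f_prime(lo) - slope) * (f_prime(mid) - slope) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


c = mean_value_point(f, f_prime, 0.0, 2.0)
print(c, f_prime(c) * (2.0 - 0.0), f(2.0) - f(0.0))
```

Here the secant slope is $4$, so the point found satisfies $3c^2 = 4$, that is, $c = 2/\sqrt{3}$, which lies in $(0, 2)$ as the theorem promises.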


[Figure: Illustration of Mean Value Theorem]


The Mean Value Theorem can be used to prove many other theorems. Here are a few 
of them. 


Theorem 5.18. Let $f$ be continuous on $[a,b]$ and differentiable on $(a,b)$. If $f'(x) > 0$ for all $x \in (a,b)$ then $f$ is increasing on $[a,b]$.

Proof. Let $c, d \in [a,b]$ where $c < d$. Then by the Mean Value Theorem there is a point $t \in (c,d)$ so that $f'(t)(d - c) = f(d) - f(c)$. Since $f'(t) > 0$ it follows that $f(d) - f(c) > 0$ so $f(d) > f(c)$ and $f$ is increasing.




Theorem 5.19. Let $f$ be continuous on $[a,b]$ and differentiable on $(a,b)$. If $f'(x) < 0$ for all $x \in (a,b)$ then $f$ is decreasing on $[a,b]$.


We leave the proof of this theorem as an exercise for the reader. 


Theorem 5.20. Let $f, g$ be continuous on $[a,b]$ and differentiable on $(a,b)$. If $f'(x) = g'(x)$ for all $x \in (a,b)$ then $f(x) = g(x) + k$ for all $x \in [a,b]$ for some constant $k$.

Proof. Let $z \in (a,b]$ and define $h(x) = f(x) - g(x)$. Then by the Mean Value Theorem there is a point $c \in (a,z)$ so that $0 = h'(c)(z - a) = h(z) - h(a)$. Thus, $h(z) = h(a)$, so $f(z) - g(z) = f(a) - g(a)$, so $f(z) = g(z) + (f(a) - g(a))$ for every point $z \in [a,b]$.


Note that a special case of the preceding theorem is that if a function has a derivative 
of zero on an interval then the function is constant. 


There are many other theorems that can be proven with the Mean Value Theorem. Normally,
we guess the Mean Value Theorem might be helpful if we know things about the derivative 
of a function and we want to know things about how function values at different points 
compare with one another. Identifying that a theorem can be phrased in those terms is a 
first step to using the Mean Value Theorem. Here are a couple of additional examples. 


Example 5.1. Let $f'(x) > 5$ on $\mathbb{R}$ and let $f(2) = 4$. Prove $f(4) > 14$.

Solution. Since $f$ is differentiable on $\mathbb{R}$, $f$ is differentiable and therefore also continuous on $[2,4]$. Thus, by the Mean Value Theorem, there is a point $c \in (2,4)$ so that $f'(c)(4 - 2) = f(4) - f(2) = f(4) - 4$. Since $f'(c) > 5$ we know that $2(5) < f(4) - 4$, so $f(4) > 14$.


Example 5.2. Prove $|\cos(b) - \cos(a)| \leq |b - a|$ for all real $a < b$.

Solution. Since $\cos(x)$ is differentiable and therefore continuous everywhere with derivative $(\cos(x))' = -\sin(x)$ by Exercise 5.6, by the Mean Value Theorem we can find some $c \in (a,b)$ so that $-\sin(c)(b - a) = \cos(b) - \cos(a)$, so $|\sin(c)||b - a| = |\cos(b) - \cos(a)|$. Since $|\sin(c)| \leq 1$ it follows that $|b - a| \geq |\cos(b) - \cos(a)|$.


Example 5.3. Let $f'(x) < g'(x)$ for all $x \in \mathbb{R}$ and let $f(0) = g(0) = 0$. Then $f(x) < g(x)$ for all $x > 0$ and $f(x) > g(x)$ for all $x < 0$.

Solution. Let $h(x) = g(x) - f(x)$. Then $h'(x) = g'(x) - f'(x) > 0$ for all $x \in \mathbb{R}$. By the Mean Value Theorem, if $x > 0$ we can choose $c \in (0,x)$ so that $h'(c)(x - 0) = h(x) - h(0) = h(x)$. Since $h'(c) > 0$ we know that $h(x) > 0$ so $g(x) - f(x) > 0$ and $g(x) > f(x)$. Likewise, if $x < 0$ we can find $c \in (x,0)$ so that $h'(c)(x - 0) = h(x) - h(0) = h(x)$. Since $h'(c) > 0$ and $x < 0$ we know that $h(x) < 0$ so $g(x) - f(x) < 0$ and $g(x) < f(x)$.




As an alternate approach, we could have used the fact that h was increasing in the last 
example. 


Theorem 5.21. Let $f'(x) \neq 0$ on $(a,b)$, and let $f$ be continuous on $[a,b]$. Then $f$ is one to one on $[a,b]$.

Proof. Let $a \leq x_1 < x_2 \leq b$. Then by the Mean Value Theorem we can find $c \in (x_1, x_2) \subset (a,b)$ so that $f'(c)(x_2 - x_1) = f(x_2) - f(x_1)$. Since $f'(c) \neq 0$ we know that $f(x_2) - f(x_1) \neq 0$, so $f(x_1) \neq f(x_2)$. Thus, $f$ is one to one on $[a,b]$.


Fermat’s Theorem and the preceding theorems about increasing and decreasing functions 
are also helpful for providing tests to find extrema. 


Theorem 5.22. The First Derivative Test. Let $f$ be differentiable on $(a,b)$ and let $c \in (a,b)$. If $f'(x) < 0$ on $(a,c)$ and $f'(x) > 0$ on $(c,b)$ then $(c, f(c))$ is a local minimum for $f$. If $f'(x) > 0$ on $(a,c)$ and $f'(x) < 0$ on $(c,b)$ then $(c, f(c))$ is a local maximum for $f$. If $f'(x) > 0$ on $(a,c)$ and $f'(x) > 0$ on $(c,b)$ or $f'(x) < 0$ on $(a,c)$ and $f'(x) < 0$ on $(c,b)$ then $(c, f(c))$ is a saddle point for $f$.

Proof. Since $(a,b)$ is open, we can find $\epsilon > 0$ so that $(c - \epsilon, c + \epsilon) \subset (a,b)$, so if $f'(x) < 0$ on $(a,c)$ and $f'(x) > 0$ on $(c,b)$ then $f$ is non-increasing on $[c - \epsilon, c]$ (by Theorem 5.19) and non-decreasing on $[c, c + \epsilon]$ (by Theorem 5.18), which means that if $c - \epsilon < x < c$ then $f(x) \geq f(c)$, and if $c < x < c + \epsilon$ then $f(x) \geq f(c)$, so $(c, f(c))$ is a local minimum for $f$. Similarly, if $f'(x) > 0$ on $(a,c)$ and $f'(x) < 0$ on $(c,b)$ then $f$ is non-decreasing on $[c - \epsilon, c]$ and non-increasing on $[c, c + \epsilon]$, which means that if $c - \epsilon < x < c$ then $f(x) \leq f(c)$, and if $c < x < c + \epsilon$ then $f(x) \leq f(c)$, so $(c, f(c))$ is a local maximum for $f$.

Finally, if $f'(x) > 0$ on $(a,c)$ and $f'(x) > 0$ on $(c,b)$ then $f$ is increasing on both $[c - \epsilon, c]$ and $[c, c + \epsilon]$, which means that $f(c - \delta) < f(c) < f(c + \delta)$ for every $0 < \delta \leq \epsilon$, which means that $(c, f(c))$ is neither a local minimum nor a local maximum, so $(c, f(c))$ is a saddle point. Similarly, if $f'(x) < 0$ on $(a,c)$ and $f'(x) < 0$ on $(c,b)$ then $f$ is decreasing on $[c - \epsilon, c + \epsilon]$, so $f(c - \delta) > f(c) > f(c + \delta)$ for every $0 < \delta \leq \epsilon$, which means that $(c, f(c))$ is neither a local minimum nor a local maximum, so $(c, f(c))$ is a saddle point.


Most of the time the process of finding local extrema using the first derivative test involves first differentiating $f$ and then setting $f'(x) = 0$ and solving. You should also identify the points at which $f'(x)$ is undefined. This gives you a collection of points $x_1, x_2, \ldots, x_n$ called critical points, where the derivative is either zero or undefined. You then test a value between consecutive critical points (or preceding the first critical point or exceeding the last critical point). It is a consequence of the Intermediate Value Theorem that if $f'(x)$ is continuous on an interval then it cannot change sign without taking on the value zero, so you can normally conclude that the sign of $f'(x)$ is the same for all points between any two critical points. The first derivative test then tells you which of the critical points represent points at which you get extrema and which represent saddle points. This only works for functions that are continuous on an interval and differentiable except at finitely many points.


Example 5.4. Let $f(x) = 2x^3 - 6x$. Find all local extrema and saddle points.

Solution: $f'(x) = 6x^2 - 6$ which exists for all values of $x$, so we set $6x^2 - 6 = 0$ and get $6(x - 1)(x + 1) = 0$ so we have $-1, 1$ as critical points. Plugging in values $x = -2$, $x = 0$, $x = 2$ to test the sign of $f'(x)$ on each interval gives us $f'(-2) = 18 > 0$, $f'(0) = -6 < 0$ and $f'(2) = 18 > 0$, which means that $(-1, f(-1)) = (-1, 4)$ is a local maximum, whereas $(1, -4)$ is a local minimum.
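The procedure used in this example (list the critical points, test the sign of $f'$ on each side, then apply the first derivative test) can be sketched in code. This is a minimal illustration under assumptions: the function name `classify`, the `pad` parameter used to pick test points beyond the outermost critical points, and the "inconclusive" label are all hypothetical choices, and the sketch presumes $f'$ keeps one sign between consecutive critical points.

```python
def f(x):
    return 2 * x ** 3 - 6 * x


def f_prime(x):
    return 6 * x ** 2 - 6


def classify(critical_points, f_prime, pad=1.0):
    """Classify critical points by the sign of f' at test points on either side."""
    pts = sorted(critical_points)
    # Test points: one before the first, midpoints between neighbours, one after the last.
    tests = [pts[0] - pad]
    tests += [(u + v) / 2 for u, v in zip(pts, pts[1:])]
    tests.append(pts[-1] + pad)
    result = {}
    for c, left, right in zip(pts, tests, tests[1:]):
        before, after = f_prime(left), f_prime(right)
        if before > 0 and after < 0:
            result[c] = "local maximum"
        elif before < 0 and after > 0:
            result[c] = "local minimum"
        elif before * after > 0:
            result[c] = "saddle point"
        else:
            result[c] = "inconclusive"   # f' vanished at a test point
    return result


labels = classify([-1.0, 1.0], f_prime)
for c, kind in labels.items():
    print(f"({c}, {f(c)}) is a {kind}")
```

With the critical points $-1$ and $1$ the test points are $-2$, $0$ and $2$, the same values used in the solution above.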


[Figure: Graph of f]


If $f''(x) > 0$ on an interval $(a,b)$ then the rate at which $f$ increases is increasing, or the rate at which $f$ decreases is decreasing. This motivates the second derivative test:


Theorem 5.23. The Second Derivative Test. Let $f$ be twice differentiable on $(a,b)$ and let $f'(c) = 0$ for some $c \in (a,b)$. Then if $f''(c) < 0$ the point $(c, f(c))$ is a (strict) local maximum and if $f''(c) > 0$ then the point $(c, f(c))$ is a (strict) local minimum.

Proof. First, assume that $f''(c) = L > 0$. Then we can find a $\delta > 0$ so that if $0 < |x - c| < \delta$ then $x \in (a,b)$ and $\left|\frac{f'(x) - f'(c)}{x - c} - L\right| < \frac{L}{2}$, which means that $\frac{f'(x) - 0}{x - c} > \frac{L}{2} > 0$. Hence, the sign of $f'(x)$ is the same as the sign of $x - c$, which means that $f'(x) > 0$ if $x \in (c, c + \delta)$ and $f'(x) < 0$ if $x \in (c - \delta, c)$. Thus, $(c, f(c))$ is a local minimum by the first derivative test. If $f''(c) < 0$ then $-f''(c) > 0$ which means that $(c, -f(c))$ is a local minimum for $-f(x)$ and thus $(c, f(c))$ is a local maximum for $f(x)$.


Example 5.5. Let $f(x) = \cos(x)$. Find all local extrema for $f(x)$.

Solution. In this case the extrema are well known (maxima at multiples of $2\pi$ and minima at odd multiples of $\pi$) but we will use the second derivative test, using $f'(x) = -\sin(x)$ and $f''(x) = -\cos(x)$. The zeroes of $\sin(x)$ are the integer multiples of $\pi$, and $-\cos(n\pi) = 1 > 0$ if $n$ is odd and $-\cos(n\pi) = -1 < 0$ if $n$ is even, which means that we have a local maximum of $(n\pi, 1)$ when $n$ is an even integer and a local minimum of $(n\pi, -1)$ when $n$ is an odd integer.
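The classification in this example can be reproduced mechanically. The sketch below is illustrative only: the helper `second_derivative_test` and its "inconclusive" return value (for the case $f''(c) = 0$, which Theorem 5.23 does not cover) are assumptions introduced here.

```python
import math


def second_derivative_test(f_second, c):
    """Classify a critical point c from the sign of f''(c)."""
    value = f_second(c)
    if value < 0:
        return "local maximum"
    if value > 0:
        return "local minimum"
    return "inconclusive"   # the test says nothing when f''(c) = 0


def cos_second(x):
    """Second derivative of cos(x)."""
    return -math.cos(x)


# The critical points of cos are the integer multiples of pi.
kinds = {n: second_derivative_test(cos_second, n * math.pi) for n in range(-2, 3)}
for n, kind in kinds.items():
    print(f"n = {n}: {kind}")
```

Even values of $n$ are reported as maxima and odd values as minima, matching the solution.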


L’Hospital’s rule has many cases. One can approach from one side or both sides or 
approach infinity, and the limit of the function derivatives can be infinity, negative infinity 
or zero and the ratio can approach a real number or an infinite limit. The infinite limits 
case is a bit awkward and it might be better to prove the simpler case since most problems 
can be reduced to this case, so we prove only part of the cases for L’Hospital’s Rule below 
and then prove the full version in the Supplementary Materials for those who are interested. 
Proving the infinity over infinity case is much simpler if we know the limit exists to begin 
with (and sometimes proofs use that assumption), but without that assumption we have to 
work harder. 


Theorem 5.24. L'Hospital's Rule (zero over zero case, approaching a real number). Let $f, g : I \to \mathbb{R}$ be differentiable and let $g'(x) \neq 0$ for all $x \in I \setminus \{a\}$ where $a$ is either an element of the open interval $I$ or an end point of $I$. Let $\lim_{x \to a} f(x) = 0 = \lim_{x \to a} g(x)$ and $\lim_{x \to a} \frac{f'(x)}{g'(x)} = L$. Then $\lim_{x \to a} \frac{f(x)}{g(x)} = L$.

Proof. First, define functions $F, G$ by setting $F(x) = f(x)$ and $G(x) = g(x)$ if $x \in I \setminus \{a\}$ and set $F(a) = G(a) = 0$. The resulting functions are continuous at $a$ by Exercise 4.3 and Theorem 4.4.

Next, note that since $G'(x) \neq 0$ for all $x \in I \setminus \{a\}$ it must follow that $G(x)$ is one to one on $I \cap [a, \infty)$ and on $I \cap (-\infty, a]$ by Theorem 5.21. Since $G(a) = 0$ we may conclude that $G(x) \neq 0$ for all $x \in I \setminus \{a\}$.

Let $\{x_n\} \to a$, where $\{x_n\} \subset I \setminus \{a\}$. Then by the Cauchy Mean Value Theorem, for each $n \in \mathbb{N}$ we may choose $c_n$ between $a$ and $x_n$ so that $F'(c_n)(G(x_n) - G(a)) = G'(c_n)(F(x_n) - F(a))$. Since $G'(c_n), G(x_n) \neq 0$ and $F(a) = G(a) = 0$, it follows that

$$\frac{F'(c_n)}{G'(c_n)} = \frac{f'(c_n)}{g'(c_n)} = \frac{f(x_n)}{g(x_n)} = \frac{F(x_n)}{G(x_n)}.$$

We know $0 < |c_n - a| < |x_n - a|$ for each natural number $n$, and $\{|x_n - a|\} \to 0$ by Exercise 3.1. Thus, $\{|c_n - a|\} \to 0$ by the Squeeze Theorem, so $\{c_n\} \to a$. Hence, by the Sequential Characterization of Limits, we know that $\left\{\frac{f'(c_n)}{g'(c_n)}\right\} \to L$, and thus $\left\{\frac{f(x_n)}{g(x_n)}\right\} \to L$. From this, it follows that $\lim_{x \to a} \frac{f(x)}{g(x)} = L$.
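A numerical look at one zero over zero limit shows the two ratios in the theorem approaching the same value. The example functions $f(x) = 1 - \cos(x)$ and $g(x) = x^2$ with $a = 0$ are assumptions chosen for illustration (here $g'(x) = 2x \neq 0$ for $x \neq 0$), and the helper name `ratio_pair` is hypothetical.

```python
import math


def ratio_pair(x):
    """Return (f(x)/g(x), f'(x)/g'(x)) for f(x) = 1 - cos(x) and g(x) = x^2."""
    return (1 - math.cos(x)) / x ** 2, math.sin(x) / (2 * x)


for k in range(1, 5):
    x = 10.0 ** (-k)
    fg, dfdg = ratio_pair(x)
    print(f"x = {x:g}: f/g = {fg:.8f}, f'/g' = {dfdg:.8f}")
```

Both columns tend to $\frac{1}{2}$. Much smaller values of $x$ are avoided on purpose, because computing $1 - \cos(x)$ in floating point loses accuracy there; that is a limitation of the arithmetic, not of the theorem.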


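Theorem 5.25. L'Hospital's Rule (zero over zero case, approaching $\infty$ or $-\infty$). Let $f, g : (a, \infty) \to \mathbb{R}$ be differentiable and let $g'(x) \neq 0$ for all $x$, where $a > 0$. Let $\lim_{x \to \infty} f(x) = 0 = \lim_{x \to \infty} g(x)$ and $\lim_{x \to \infty} \frac{f'(x)}{g'(x)} = L$. Then $\lim_{x \to \infty} \frac{f(x)}{g(x)} = L$.
Furthermore, if we replace $\infty$ by $-\infty$ and $(a, \infty)$ by $(-\infty, a)$ where $a < 0$, the theorem is still true.

Proof. We begin by addressing the first case, where $f, g : (a, \infty) \to \mathbb{R}$ and $\lim_{x \to \infty} f(x) = 0 = \lim_{x \to \infty} g(x)$. By Theorem 4.14, we know that $\lim_{x \to \infty} f(x) = 0$ if and only if $\lim_{x \to 0^+} f(\frac{1}{x}) = 0$. Likewise, $\lim_{x \to \infty} g(x) = 0$ if and only if $\lim_{x \to 0^+} g(\frac{1}{x}) = 0$. Similarly, $\lim_{x \to \infty} \frac{f'(x)}{g'(x)} = L$ if and only if $\lim_{x \to 0^+} \frac{f'(\frac{1}{x})}{g'(\frac{1}{x})} = L$. By the chain rule, $(f(\frac{1}{x}))' = -\frac{f'(\frac{1}{x})}{x^2}$ and $(g(\frac{1}{x}))' = -\frac{g'(\frac{1}{x})}{x^2}$, which means that

$$\frac{(f(\frac{1}{x}))'}{(g(\frac{1}{x}))'} = \frac{f'(\frac{1}{x})(-\frac{1}{x^2})}{g'(\frac{1}{x})(-\frac{1}{x^2})} = \frac{f'(\frac{1}{x})}{g'(\frac{1}{x})}.$$

Thus, from the previous form of L'Hospital's rule, since $f(\frac{1}{x}), g(\frac{1}{x}) : (0, \frac{1}{a}) \to \mathbb{R}$ are differentiable and $(g(\frac{1}{x}))'$ is non-zero and $\lim_{x \to 0^+} f(\frac{1}{x}) = 0 = \lim_{x \to 0^+} g(\frac{1}{x})$, and we also know that

$$\lim_{x \to 0^+} \frac{(f(\frac{1}{x}))'}{(g(\frac{1}{x}))'} = \lim_{x \to 0^+} \frac{f'(\frac{1}{x})}{g'(\frac{1}{x})} = \lim_{x \to \infty} \frac{f'(x)}{g'(x)} = L,$$

we are able to conclude that $\lim_{x \to 0^+} \frac{f(\frac{1}{x})}{g(\frac{1}{x})} = L$. Using Theorem 4.14 again gives us that $\lim_{x \to \infty} \frac{f(x)}{g(x)} = L$.

The case where we replace $\infty$ by $-\infty$ is similar except that we replace $x \to 0^+$ by $x \to 0^-$ and replace $(0, \frac{1}{a})$ by $(\frac{1}{a}, 0)$.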


Theorem 5.26. Let $f(x)$ be a one to one continuous function on $(a,b)$. Then $f(x)$ is strictly monotone and $f^{-1}(x)$ is continuous and strictly monotone. Furthermore, if $f(x)$ is increasing then $f^{-1}(x)$ is increasing, and if $f(x)$ is decreasing then $f^{-1}(x)$ is decreasing.

Proof. We first claim that if $f$ is not monotone then there are points $x_1 < x_2 < x_3$ in $(a,b)$ so that either $f(x_1) < f(x_2)$ and $f(x_2) > f(x_3)$ or $f(x_1) > f(x_2)$ and $f(x_2) < f(x_3)$. To prove this claim, suppose that the claim is false. That is, suppose that $f$ is not monotone but for any points $x_1 < x_2 < x_3$ in $(a,b)$, if $f(x_1) < f(x_2)$ then $f(x_2) < f(x_3)$ and if $f(x_1) > f(x_2)$ then $f(x_2) > f(x_3)$. Let $x_1, x_2 \in (a,b)$ with $x_1 < x_2$. Assume $f(x_1) < f(x_2)$. Then if $x_2 < x < y < b$ we know that since $f(x_1) < f(x_2)$ it follows that $f(x_2) < f(x)$, from which we conclude that $f(x) < f(y)$. Likewise, if $x_1 < x < x_2$ then $f(x_1) < f(x)$, since otherwise $f(x_1) > f(x)$, which would force $f(x) > f(x_2)$ and hence $f(x_1) > f(x_2)$, contrary to our assumption; so $f(x_1) < f(x)$ and by the preceding argument then $f(x) < f(x_2)$. Finally, if $a < x < x_1$ then $f(x) < f(x_1)$ since otherwise $f(x) > f(x_1)$ and $f(x_1) < f(x_2)$, contradicting the supposition that the claim is false. It follows that $f$ is increasing on all of $(a,b)$, contradicting the supposition that $f$ is not monotone. Similarly, if $f(x_1) > f(x_2)$ it follows that the function $f$ is decreasing, contradicting the assumption that $f$ is not monotone. Hence, the claim is true.

We now show that $f$ is strictly monotone. Suppose $f$ is not strictly monotone. Since $f$ is one to one, this means that $f$ is not monotone. Then by the claim, there are points $x_1 < x_2 < x_3$ in $(a,b)$ so that either $f(x_1) < f(x_2)$ and $f(x_2) > f(x_3)$ or $f(x_1) > f(x_2)$ and $f(x_2) < f(x_3)$. In the former case, if we choose $k$ between $\max(f(x_1), f(x_3))$ and $f(x_2)$ then by the Intermediate Value Theorem we can find points $c_1 \in (x_1, x_2)$ and $c_2 \in (x_2, x_3)$ so that $f(c_1) = k = f(c_2)$ so $f$ is not one to one. In the latter case the proof is similar, choosing $k$ between $\min(f(x_1), f(x_3))$ and $f(x_2)$. Thus, $f$ is strictly monotone.

Next, assume that $f$ is increasing, and let $y_1 < y_2$ in the range of $f$. Then there are points $x_1, x_2 \in (a,b)$ so that $f(x_1) = y_1$ and $f(x_2) = y_2$. Suppose $f^{-1}(y_1) > f^{-1}(y_2)$. Then since $x_1 > x_2$ it follows that $f(x_1) > f(x_2)$, so $y_1 > y_2$, which is impossible. Similarly, if $f$ is decreasing then $f^{-1}$ must be decreasing.

Finally, to show continuity we will assume that $f$ is strictly increasing, let $y_0 = f(x_0)$ for some $x_0 \in (a,b)$ and let $\epsilon > 0$. Choose $x_1, x_2 \in (a,b)$ so that $x_0 - \epsilon < x_1 < x_0 < x_2 < x_0 + \epsilon$, and let $y_1 = f(x_1)$ and $y_2 = f(x_2)$. Let $\delta = \min(y_0 - y_1, y_2 - y_0)$. Then if $|y - y_0| < \delta$ it follows that $y_1 < y < y_2$ and hence $x_0 - \epsilon < x_1 < f^{-1}(y) < x_2 < x_0 + \epsilon$, so $|f^{-1}(y_0) - f^{-1}(y)| < \epsilon$. Thus, $f^{-1}$ is continuous. If $f$ is decreasing then the argument is similar.


Theorem 5.27. Inverse Function Theorem. Let $f(x)$ be a one to one continuous function on $(a,b)$. If $f'(x_0)$ exists and is non-zero for some $x_0 \in (a,b)$ and $g(x) = f^{-1}(x)$ then $g'(f(x_0)) = \frac{1}{f'(x_0)}$.

Proof. Let $y_0 = f(x_0)$. By Theorem 5.26 we know that $g$ is continuous on $f((a,b))$. Let $\{y_n\}$ be a sequence of points in $f((a,b)) \setminus \{y_0\}$ so that $\{y_n\} \to y_0$.
Then $\{g(y_n)\} \to g(y_0) = x_0$ by the Sequential Characterization of Continuity, and $g(y_n) \neq x_0$ for any $n \in \mathbb{N}$. Since $f$ is differentiable at $x_0$ we know that $\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = f'(x_0)$. Since $f'(x_0) \neq 0$ it follows that $\lim_{x \to x_0} \frac{x - x_0}{f(x) - f(x_0)} = \frac{1}{f'(x_0)}$, so, by the Sequential Characterization of Limits,

$$\left\{ \frac{g(y_n) - g(y_0)}{f(g(y_n)) - f(g(y_0))} \right\} \to \frac{1}{f'(x_0)},$$

where $f(g(y_n)) - f(g(y_0)) \neq 0$ for any $n \in \mathbb{N}$ since $f$ is one to one. Since $f$ and $g$ are inverse functions, this means that for any sequence of points $\{y_n\} \subset f((a,b)) \setminus \{y_0\}$ so that $\{y_n\} \to y_0$ it is true that

$$\left\{ \frac{g(y_n) - g(y_0)}{f(g(y_n)) - f(g(y_0))} \right\} = \left\{ \frac{g(y_n) - g(y_0)}{y_n - y_0} \right\} \to \frac{1}{f'(x_0)}.$$

Thus, from the Sequential Characterization of Limits again, we conclude that $\lim_{y \to y_0} \frac{g(y) - g(y_0)}{y - y_0} = \frac{1}{f'(x_0)}$.


We used sequences for this proof, but we could have used Theorem 4.11 as well. It might 
be instructive for the reader to write out a proof of the Inverse Function Theorem using 
Theorem 4.11. 


The Inverse Function Theorem in higher dimensions is quite important and is the key 
to the Implicit Function Theorem. It is still useful in one variable, primarily for purposes of 
showing a derivative exists for an inverse function. While the theorem can directly give us 
the derivative of the inverse function in some cases, in others simply knowing the derivative 
exists can let us use the chain rule to differentiate the inverse function. When an inverse 
of a value can be found directly, this theorem can quickly give the derivative of the inverse 
function based on the derivative of the original function, as shown in the following example. 


Example 5.6. Let $f(x) = x^5 + x^3 + 2x + 1$. Let $g(x) = f^{-1}(x)$. Find $g'(1)$. 


Solution. First, note that $f'(x) = 5x^4 + 3x^2 + 2 > 0$ for all x, which means that f is one to 
one and has an inverse. Since $f(0) = 1$, $g'(1) = \frac{1}{f'(0)} = \frac{1}{2}$. 
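
As a numerical sanity check of Example 5.6 (a sketch only, not part of the argument), the following Python code inverts f by bisection, which works because f is increasing, and compares a central difference quotient of g at 1 with the predicted value 1/2. The function names, the bracketing interval [-10, 10] and the step size h are illustrative assumptions rather than anything fixed by the text.

```python
def f(x):
    # The function from Example 5.6.
    return x**5 + x**3 + 2*x + 1

def g(y, lo=-10.0, hi=10.0):
    """Approximate the inverse of f at y by bisection (f is increasing)."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

h = 1e-5  # assumed step size for the difference quotient
g_prime_at_1 = (g(1 + h) - g(1 - h)) / (2 * h)
print(g_prime_at_1)  # should be close to 1/2
```

The bisection step stands in for the inverse only because no closed form for g is available; the theorem itself needs no such computation.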




Exercises: 


Exercise 5.1. Prove Theorem 5.19. 


Exercise 5.2. If $f(x) = x^n$, where $n \in \mathbb{N}$ then $f'(x) = nx^{n-1}$. 


Exercise 5.3. If $r \in \mathbb{Q}$ then prove that $(x^r)' = rx^{r-1}$ when this expression is defined. In 
this theorem we use the convention that we replace $0x^{-1}$ by 0 for purposes of this formula 
even at $x = 0$ and replace $x^0$ by 1 for all x, even at $x = 0$. 


Exercise 5.4. If f is differentiable at $x_0$ and g is differentiable at $f(x_0)$ then $g \circ f$ is defined 
on an open interval containing $x_0$. 


Exercise 5.5. Let $-\frac{\pi}{2} < x < \frac{\pi}{2}$. Let $T_1$ be the triangle with vertices (0,0), (1,0) and 
(cos(x), sin(x)) and let $T_2$ be the triangle with vertices (0,0), (1,0) and (1, tan(x)). Then 
the circular sector S of the unit circle between the positive x-axis and the ray based at the 
origin through the point (cos(x), sin(x)) contains the triangular disk bounded by $T_1$ and is 
contained in the triangular disk $T_2$. Assuming that the area enclosed by $T_1$ is understood to 
be less than or equal to the area within the circular sector S, which is less than or equal to the 
area enclosed by $T_2$, and using standard formulas for area of a triangle and a circular sector, 
as well as standard trigonometric identities, show that $\cos(x) \le \frac{\sin(x)}{x} \le 1$, $\lim_{x \to 0} \sin(x) = 0$, 
$\lim_{x \to 0} \cos(x) = 1$, $\lim_{x \to 0} \frac{\sin(x)}{x} = 1$ and $\lim_{x \to 0} \frac{\cos(x) - 1}{x} = 0$. 




[Figure: Comparing Areas to Sine] 


Exercise 5.6. Using the result that $\lim_{x \to 0} \frac{\sin(x)}{x} = 1$, and the sine and cosine sum and 
difference of angle formulas and Pythagorean identities (or the sum to product or other 
trigonometric identities if preferred, all assumed without proof in this development), show 
that $(\sin(x))' = \cos(x)$, $(\cos(x))' = -\sin(x)$, $(\tan(x))' = \sec^2(x)$, $(\csc(x))' = -\csc(x)\cot(x)$, 
$(\sec(x))' = \sec(x)\tan(x)$ and $(\cot(x))' = -\csc^2(x)$. 


Exercise 5.7. A degree n polynomial can have at most n real zeroes. 
Exercise 5.8. Let $f'(x) > 4$ for all $x \in \mathbb{R}$, and let $f(0) = 2$. Then $f(2) > 10$. 


Exercise 5.9. Give an example, with proof, of a function which is continuous but not 
differentiable. 


Exercise 5.10. Let $f : I \to \mathbb{R}$ be a differentiable function, where I is an interval, having 
the property that $f'(x)$ is bounded. Then f is uniformly continuous. Furthermore, if $M > 
|f'(x)|$ for all $x \in I$ then $|f(x) - f(y)| < M|x - y|$ for all $x, y \in I$ so that $x \ne y$. 


Exercise 5.11. If $f'(a) > 0$ then there is an open interval (c,d) containing a so that if 
$c < x_1 < a < x_2 < d$ then $f(x_1) < f(a) < f(x_2)$. 




Exercise 5.12. If f(x) is non-decreasing and differentiable on (a,b) then $f'(x) \ge 0$ for all 
$x \in (a,b)$. Furthermore, if f is non-decreasing and differentiable on [a,b] then $f'(x) \ge 0$ on 
[a,b]. 


Exercise 5.13. If f(x) is non-increasing and differentiable on (a,b) then $f'(x) \le 0$ for all 
$x \in (a,b)$. Furthermore, if f is non-increasing and differentiable on [a,b] then $f'(x) \le 0$ on 
[a,b]. 


Exercise 5.14. (a) If f is a differentiable odd function (meaning that $f(x) = -f(-x)$ 
for all $x \in \mathrm{dom}(f)$) then $f'(x)$ is an even function (meaning that $f'(-x) = f'(x)$ for all 
$x \in \mathrm{dom}(f)$). 

(b) If f is an even differentiable function then $f'(x)$ is an odd function. 


Exercise 5.15. Using the Inverse Function Theorem, we can argue that if we restrict sin(x) 
to the domain $-\frac{\pi}{2} \le x \le \frac{\pi}{2}$ then the inverse of this function is differentiable on the interior 
of its domain. Using this fact (or otherwise), show that 

\[ (\sin^{-1}(x))' = \frac{1}{\sqrt{1 - x^2}}. \]


Exercise 5.16. In a similar manner to the preceding theorem, find intervals on which the 
functions tan(x) and sec(x) could be restricted so that they would be invertible over their 
restricted domains, and derive formulas for the derivatives of $\tan^{-1}(x)$ and $\sec^{-1}(x)$. Using 
the fact that each of these inverse trigonometric functions, when added to its corresponding 
inverse co-function, has a sum of $\frac{\pi}{2}$ (or otherwise), find derivatives for $\cos^{-1}(x)$, $\csc^{-1}(x)$, 
and $\cot^{-1}(x)$. 


Exercise 5.17. Give an example, with proof, of a function which is differentiable at a point, 
whose derivative is not differentiable at that point. 


Exercise 5.18. Give an example, with proof, of a function whose derivative is not bounded 
which is uniformly continuous. 


Exercise 5.19. For every non-negative integer n, if $h(x) = f(x)g(x)$, where f and g are n 
times differentiable, then 

\[ h^{(n)}(x) = \sum_{i=0}^{n} \binom{n}{i} f^{(n-i)}(x)\, g^{(i)}(x). \]


Exercise 5.20. Let $\epsilon > 0$ and let $h, g : D \to \mathbb{R}$ be differentiable on $(a - \epsilon, a + \epsilon)$. Let f be a 
real valued function so that $f(x) = g(x)$ if $a - \epsilon < x < a$ and let $f(x) = h(x)$ if $a < x < a + \epsilon$, 
where f is continuous at $x = a$. If $\lim_{x \to a^-} g'(x) = k = \lim_{x \to a^+} h'(x)$ then $f'(a) = k$. 




Hints: 


Hint to Exercise 5.1. Prove Theorem 5.19. 


Parallel the proof (or use the result) of Theorem 5.18. 


Hint to Exercise 5.2. If $f(x) = x^n$, where $n \in \mathbb{N}$ then $f'(x) = nx^{n-1}$ (where $x^0$ is 
understood to mean 1, even at zero). 


Use either induction with the product rule or the Binomial Theorem with the definition 
of derivative. 


Hint to Exercise 5.3. If $r \in \mathbb{Q}$ then prove that $(x^r)' = rx^{r-1}$ when this expression is 
defined. In this theorem we use the convention that we replace $0x^{-1}$ by 0 for purposes of 
this formula even at $x = 0$ and replace $x^0$ by 1 for all x, even at $x = 0$. 


Use the Inverse Function Theorem (and probably the chain rule) to differentiate root 
functions. Use the quotient rule for negative integer powers. Then use the chain rule to 
differentiate the composition. 


Hint to Exercise 5.4. If f is differentiable at $x_0$ and g is differentiable at $f(x_0)$ then $g \circ f$ 
is defined on an open interval containing $x_0$. 


Remember that to be differentiable at a point implies the function is defined on a 
neighborhood about that point. Use the continuity of f to find an interval small enough, 
centered at $x_0$, so that its image lies within an interval on which g is defined. 


Hint to Exercise 5.5. Let $-\frac{\pi}{2} < x < \frac{\pi}{2}$. Let $T_1$ be the triangle with vertices (0,0), (1,0) 
and (cos(x), sin(x)) and let $T_2$ be the triangle with vertices (0,0), (1,0) and (1, tan(x)). Then 
the circular sector S of the unit circle between the positive x-axis and the ray based at the 
origin through the point (cos(x), sin(x)) contains the triangular disk bounded by $T_1$ and is 
contained in the triangular disk $T_2$. Assuming that the area enclosed by $T_1$ is understood to 
be less than or equal to the area within the circular sector S, which is less than or equal to the 
area enclosed by $T_2$, and using standard formulas for area of a triangle and a circular sector, 
as well as standard trigonometric identities, show that $\cos(x) \le \frac{\sin(x)}{x} \le 1$, $\lim_{x \to 0} \sin(x) = 0$, 
$\lim_{x \to 0} \cos(x) = 1$, $\lim_{x \to 0} \frac{\sin(x)}{x} = 1$ and $\lim_{x \to 0} \frac{\cos(x) - 1}{x} = 0$. 


Use the Squeeze Theorem and the identity $\tan(x) = \frac{\sin(x)}{\cos(x)}$ and the fact that sin(x) is 
an odd function and cos(x) is an even function. 




Hint to Exercise 5.6. Using the result that $\lim_{x \to 0} \frac{\sin(x)}{x} = 1$, and the sine and cosine sum 
and difference of angle formulas and Pythagorean identities (or the sum to product or other 
trigonometric identities if preferred, all assumed without proof in this development), show 
that $(\sin(x))' = \cos(x)$, $(\cos(x))' = -\sin(x)$, $(\tan(x))' = \sec^2(x)$, $(\csc(x))' = -\csc(x)\cot(x)$, 
$(\sec(x))' = \sec(x)\tan(x)$ and $(\cot(x))' = -\csc^2(x)$. 


Use identities for the derivative of sine and cosine (pythagorean or sum to product). 
The other function derivatives follow from the quotient rule. 


Hint to Exercise 5.7. A degree n polynomial can have at most n real zeroes. 


Use induction and Rolle’s Theorem. 


Hint to Exercise 5.8. Let $f'(x) > 4$ for all $x \in \mathbb{R}$, and let $f(0) = 2$. Then $f(2) > 10$. 


Use the Mean Value Theorem. 


Hint to Exercise 5.9. Give an example, with proof, of a function which is continuous but 
not differentiable. 


Try to think of a function that forms a corner or cusp or has a derivative approaching 
infinity at a point (probably at zero). 


Hint to Exercise 5.10. Let $f : I \to \mathbb{R}$ be a differentiable function, where I is an interval, 
having the property that $f'(x)$ is bounded. Then f is uniformly continuous. Furthermore, 
if $M > |f'(x)|$ for all $x \in I$ then $|f(x) - f(y)| < M|x - y|$ for all $x, y \in I$ so that $x \ne y$. 


Use the Mean Value Theorem. 


Hint to Exercise 5.11. If $f'(a) > 0$ then there is an open interval (c,d) containing a so 
that if $c < x_1 < a < x_2 < d$ then $f(x_1) < f(a) < f(x_2)$. 

Choose a short enough open interval containing a so that the difference quotient 
$\frac{f(x) - f(a)}{x - a} > 0$ on that interval. 


Hint to Exercise 5.12. If f(x) is non-decreasing and differentiable on (a,b) then $f'(x) \ge 0$ 
for all $x \in (a,b)$. Furthermore, if f is non-decreasing and differentiable on [a,b] then 
$f'(x) \ge 0$ on [a,b]. 


Use the Comparison Theorem. 


Hint to Exercise 5.13. If f(x) is non-increasing and differentiable on (a,b) then $f'(x) \le 0$ 
for all $x \in (a,b)$. Furthermore, if f is non-increasing and differentiable on [a,b] then 
$f'(x) \le 0$ on [a,b]. 




Use the Comparison Theorem. 


Hint to Exercise 5.14. (a) If f is a differentiable odd function (meaning that $f(x) = 
-f(-x)$ for all $x \in \mathrm{dom}(f)$) then $f'(x)$ is an even function (meaning that $f'(-x) = f'(x)$ 
for all $x \in \mathrm{dom}(f)$). 


(b) If f is an even differentiable function then $f'(x)$ is an odd function. 
Use either the definition of derivative or the chain rule. 


Hint to Exercise 5.15. Using the Inverse Function Theorem, we can argue that if we 
restrict sin(x) to the domain $-\frac{\pi}{2} \le x \le \frac{\pi}{2}$ then the inverse of this function is differentiable 
on the interior of its domain. Using this fact (or otherwise), 
show that $(\sin^{-1}(x))' = \frac{1}{\sqrt{1 - x^2}}$. 

Start with $y = \sin^{-1}(x)$ so $\sin(y) = x$. Then use the Inverse Function Theorem to 
conclude y is differentiable, and differentiate both sides using the chain rule. 


Hint to Exercise 5.16. In a similar manner to the preceding theorem, find intervals on 
which the functions tan(x) and sec(x) could be restricted so that they would be invertible 
over their restricted domains, and derive formulas for the derivatives of $\tan^{-1}(x)$ and 
$\sec^{-1}(x)$. Using the fact that each of these inverse trigonometric functions, when added 
to its corresponding inverse co-function, has a sum of $\frac{\pi}{2}$ (or otherwise), find derivatives for 
$\cos^{-1}(x)$, $\csc^{-1}(x)$, and $\cot^{-1}(x)$. 


Normally, the convention is to restrict cos(x) to $[0, \pi]$, cot(x) to $(0, \pi)$, and tan(x) to 
$(-\frac{\pi}{2}, \frac{\pi}{2})$. Restrict sec(x) to $[0, \frac{\pi}{2}) \cup [\pi, \frac{3\pi}{2})$ and csc(x) to $(0, \frac{\pi}{2}] \cup (\pi, \frac{3\pi}{2}]$. Then use the same 
process as outlined in the previous exercise. 


Hint to Exercise 5.17. Give an example, with proof, of a function which is differentiable 
at a point, whose derivative is not differentiable at that point. 


Try to get the second derivative to either approach infinity or simply not exist. Think 
of a function that is not differentiable at zero which has an antiderivative. 


Hint to Exercise 5.18. Give an example, with proof, of a function whose derivative is 
not bounded which is uniformly continuous. 


Think of a function which is continuous on a closed interval whose slope approaches 
infinity (consider a rounded or rapidly oscillating function approaching a vertical tangent). 


Hint to Exercise 5.19. For every non-negative integer n, if $h(x) = f(x)g(x)$, where f 
and g are n times differentiable, then $h^{(n)}(x) = \sum_{i=0}^{n} \binom{n}{i} f^{(n-i)}(x)\, g^{(i)}(x)$. 




Model the proof after the argument for the Binomial Theorem using the product rule. 


Hint to Exercise 5.20. Let $\epsilon > 0$ and let $h, g : D \to \mathbb{R}$ be differentiable on $(a - \epsilon, a + \epsilon)$. 
Let f be a real valued function so that $f(x) = g(x)$ if $a - \epsilon < x < a$ and let $f(x) = h(x)$ 
if $a < x < a + \epsilon$, where f is continuous at $x = a$. If $\lim_{x \to a^-} g'(x) = k = \lim_{x \to a^+} h'(x)$ then 
$f'(a) = k$. 


Take one sided limits and use Theorem 4.15. 




Solutions: 


Solution to Exercise 5.1. Prove Theorem 5.19. 


Proof. Let $c, d \in [a,b]$, where $c < d$. By the Mean Value Theorem, there is some point 
$t \in (c,d)$ so that $f'(t)(d - c) = f(d) - f(c)$. Since $f'(t) < 0$ and $(d - c) > 0$, it follows that 
$f'(t)(d - c) < 0$, so $f(d) - f(c) < 0$, which means that $f(d) < f(c)$. Hence, f is decreasing 
on [a,b]. 


Solution to Exercise 5.2. If $f(x) = x^n$, where $n \in \mathbb{N}$ then $f'(x) = nx^{n-1}$ (where $x^0$ is 
understood to mean 1, even at zero). 


Proof. We can use the Binomial Theorem or induction. We will use induction. We have 
already shown that the derivative of $y = x$ is $1 = 1x^0$. We assume that the formula is true 
for a natural number k. Then $(x^{k+1})' = (x \cdot x^k)' = (1)(x^k) + x(kx^{k-1}) = (k+1)x^k$ by the 
product rule and the induction hypothesis. The result follows by induction. 


Solution to Exercise 5.3. If $r \in \mathbb{Q}$ then prove that $(x^r)' = rx^{r-1}$ when this expression is 
defined. In this theorem we use the convention that we replace $0x^{-1}$ by 0 for purposes of 
this formula even at $x = 0$ and replace $x^0$ by 1 for all x, even at $x = 0$. 


Proof. Let $r = \frac{p}{q}$ where p is an integer and q is a natural number. First, note that if p is 
zero the derivative is zero. If p is a negative integer then $-p$ is a natural number, so by the 
preceding argument 

\[ (x^p)' = \left(\frac{1}{x^{-p}}\right)' = \frac{0 - (-p)x^{-p-1}}{x^{-2p}} \]

using the quotient rule. This simplifies to $px^{p-1}$. Next, using the Inverse Function Theorem, 
we know that since $x^{1/q}$ is the inverse of $x^q$, the function $y = x^{1/q}$ is differentiable. Hence 
$y^q = x$ and by the chain rule $qy^{q-1}y' = 1$, so $y' = \frac{1}{q}x^{\frac{1}{q} - 1}$. Finally, again using the chain rule, 
we have that if $y = x^{p/q} = (x^{1/q})^p$ then 

\[ y' = p\left((x^{1/q})^{p-1}\right)\left(\frac{1}{q}x^{\frac{1}{q} - 1}\right) = \frac{p}{q}x^{\frac{p}{q} - 1}, \]

as desired. 


Solution to Exercise 5.4. If f is differentiable at $x_0$ and g is differentiable at $f(x_0)$ then 
$g \circ f$ is defined on an open interval containing $x_0$. 


Proof. Since g is differentiable at $f(x_0)$ there is an $\epsilon_g > 0$ so that $(f(x_0) - \epsilon_g, f(x_0) + \epsilon_g) \subset 
\mathrm{dom}(g)$. Since f is differentiable at $x_0$ there is an $\epsilon_f > 0$ so that $(x_0 - \epsilon_f, x_0 + \epsilon_f) \subset \mathrm{dom}(f)$. 
Since f is differentiable at $x_0$, f is continuous at $x_0$ by Theorem 5.3. Thus, we can 
find $\delta_1 > 0$ so that if $|x_0 - x| < \delta_1$ and $x \in \mathrm{dom}(f)$ then $|f(x) - f(x_0)| < \epsilon_g$. Thus, if 
$|x - x_0| < \delta = \min\{\epsilon_f, \delta_1\}$ then $x \in \mathrm{dom}(f)$ and $|f(x) - f(x_0)| < \epsilon_g$, so $(x_0 - \delta, x_0 + \delta) \subset 
\mathrm{dom}(g \circ f)$. 




Solution to Exercise 5.5. Let $-\frac{\pi}{2} < x < \frac{\pi}{2}$. Let $T_1$ be the triangle with vertices 
(0,0), (1,0) and (cos(x), sin(x)) and let $T_2$ be the triangle with vertices (0,0), (1,0) and 
(1, tan(x)). Then the circular sector S of the unit circle between the positive x-axis and the 
ray based at the origin through the point (cos(x), sin(x)) contains the triangular disk bounded 
by $T_1$ and is contained in the triangular disk $T_2$. Assuming that the area enclosed by $T_1$ is 
understood to be less than or equal to the area within the circular sector S, which is less than 
or equal to the area enclosed by $T_2$, and using standard formulas for area of a triangle and a 
circular sector, as well as standard trigonometric identities, show that $\cos(x) \le \frac{\sin(x)}{x} \le 1$, 
$\lim_{x \to 0} \sin(x) = 0$, $\lim_{x \to 0} \cos(x) = 1$, $\lim_{x \to 0} \frac{\sin(x)}{x} = 1$ and $\lim_{x \to 0} \frac{\cos(x) - 1}{x} = 0$. 


Proof. The area of triangle $\triangle ABC$ is $\frac{\sin(x)}{2}$, which is less than the area of the sector of the 
circle ABC, which is $\frac{x}{2}$, which is less than the area of triangle $\triangle ADC$, which is $\frac{\tan(x)}{2}$. 
Hence, we have $\sin(x) < x < \tan(x)$ for small positive x (and $x < \sin(x) < 0$ for small 
negative x). It follows that $\frac{\sin(x)}{x} < 1$. Since $x < \frac{\sin(x)}{\cos(x)}$ we know that $\cos(x) < \frac{\sin(x)}{x}$. 
Next, since $0 \le |\sin(x)| \le |x|$ for $-\frac{\pi}{2} < x < \frac{\pi}{2}$ from this picture, we conclude from 
Exercise 4.1 (or just the Squeeze Theorem) that $\lim_{x \to 0} \sin(x) = 0$ since $\lim_{x \to 0} |x| = 0$. Next, 
note that $\cos(x) = \sqrt{1 - \sin^2(x)}$ for all such x, since $\cos(x) > 0$ on this interval and $\sqrt{x}$ is 
defined for all values $1 - \sin^2(x)$, so, since $f(x) = \sqrt{x}$ is continuous, we have that 
$\lim_{x \to 0} \cos(x) = 1$ by Theorem 4.11. Since $\cos(x) < \frac{\sin(x)}{x} < 1$ for $0 < x < \frac{\pi}{2}$, and 
$\lim_{x \to 0} \cos(x) = 1$ and $\lim_{x \to 0} 1 = 1$, it follows from the Squeeze Theorem that 
$\lim_{x \to 0^+} \frac{\sin(x)}{x} = 1$. If x is negative then both sin(x) and tan(x) are negated, so $\frac{\sin(x)}{x}$ is 
still between cos(x) and 1, which means that $\lim_{x \to 0^-} \frac{\sin(x)}{x} = 1$, so $\lim_{x \to 0} \frac{\sin(x)}{x} = 1$. 
Finally, 

\[ \lim_{x \to 0} \frac{\cos(x) - 1}{x} = \lim_{x \to 0} \frac{(\cos(x) - 1)(\cos(x) + 1)}{x(\cos(x) + 1)} = \lim_{x \to 0} \frac{\cos^2(x) - 1}{x(\cos(x) + 1)} \]
\[ = \lim_{x \to 0} \frac{-\sin^2(x)}{x(\cos(x) + 1)} = \lim_{x \to 0} \frac{\sin(x)}{x} \cdot \frac{-\sin(x)}{\cos(x) + 1} = (1)\left(\frac{0}{2}\right) = 0. \]
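
The squeeze inequality and the two limits in this solution can be spot-checked numerically. The short Python sketch below is illustrative only: the sample points and the value `small` are assumptions chosen for the check, and a finite sample of course proves nothing about the limits.

```python
import math

def ratio(x):
    # sin(x)/x for x != 0
    return math.sin(x) / x

samples = [1.0, 0.5, 0.1, 0.01, -0.01, -0.5]  # assumed sample points in (-pi/2, pi/2)
inequality_holds = all(math.cos(x) < ratio(x) < 1 for x in samples)

small = 1e-4  # assumed "near zero" input
sin_over_x = ratio(small)
cos_minus_one_over_x = (math.cos(small) - 1) / small

print(inequality_holds, sin_over_x, cos_minus_one_over_x)
```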


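Solution to Exercise 5.6. Using the result that $\lim_{x \to 0} \frac{\sin(x)}{x} = 1$, and the sine and cosine 
sum and difference of angle formulas and Pythagorean identities (or the sum to product or 
other trigonometric identities if preferred, all assumed without proof in this development), 
show that $(\sin(x))' = \cos(x)$, $(\cos(x))' = -\sin(x)$, $(\tan(x))' = \sec^2(x)$, $(\csc(x))' = -\csc(x)\cot(x)$, 
$(\sec(x))' = \sec(x)\tan(x)$ and $(\cot(x))' = -\csc^2(x)$. 

Proof. First, note that 

\[ \lim_{h \to 0} \frac{\cos(h) - 1}{h} = \lim_{h \to 0} \frac{(\cos(h) - 1)(\cos(h) + 1)}{h(\cos(h) + 1)} = \lim_{h \to 0} \frac{\cos^2(h) - 1}{h(\cos(h) + 1)} \]
\[ = \lim_{h \to 0} \frac{-\sin^2(h)}{h(\cos(h) + 1)} = \lim_{h \to 0} \frac{\sin(h)}{h} \cdot \frac{-\sin(h)}{\cos(h) + 1} = 0. \]

Thus, 

\[ (\cos(x))' = \lim_{h \to 0} \frac{\cos(x + h) - \cos(x)}{h} = \lim_{h \to 0} \frac{\cos(x)\cos(h) - \sin(x)\sin(h) - \cos(x)}{h} \]
\[ = \lim_{h \to 0} \left( -\sin(x)\frac{\sin(h)}{h} + \cos(x)\frac{\cos(h) - 1}{h} \right) = -\sin(x)(1) + 0 \]

by Theorem 4.6, so $(\cos(x))' = -\sin(x)$. Likewise, 

\[ (\sin(x))' = \lim_{h \to 0} \frac{\sin(x + h) - \sin(x)}{h} = \lim_{h \to 0} \frac{\sin(x)\cos(h) + \cos(x)\sin(h) - \sin(x)}{h} \]
\[ = \lim_{h \to 0} \left( \sin(x)\frac{\cos(h) - 1}{h} + \cos(x)\frac{\sin(h)}{h} \right) = \cos(x). \]

From this we can use the quotient rule to obtain the remaining derivatives: 

\[ (\tan(x))' = \left(\frac{\sin(x)}{\cos(x)}\right)' = \frac{\cos(x)\cos(x) - \sin(x)(-\sin(x))}{\cos^2(x)} = \frac{1}{\cos^2(x)} = \sec^2(x). \]
\[ (\cot(x))' = \left(\frac{\cos(x)}{\sin(x)}\right)' = \frac{-\sin(x)\sin(x) - \cos(x)\cos(x)}{\sin^2(x)} = \frac{-1}{\sin^2(x)} = -\csc^2(x). \]
\[ (\sec(x))' = \left(\frac{1}{\cos(x)}\right)' = \frac{0 - (-\sin(x))}{\cos^2(x)} = \frac{\sin(x)}{\cos^2(x)} = \sec(x)\tan(x). \]
\[ (\csc(x))' = \left(\frac{1}{\sin(x)}\right)' = \frac{0 - \cos(x)}{\sin^2(x)} = \frac{-\cos(x)}{\sin^2(x)} = -\csc(x)\cot(x). \]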
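
The six derivative formulas can be compared against central difference quotients as a rough numerical check. The sketch below is only an illustration: the helper names, the sample point `x0 = 0.7` and the step size are assumptions, and agreement at one point is evidence rather than proof.

```python
import math

def central_diff(func, x, h=1e-6):
    # Symmetric difference quotient approximating func'(x).
    return (func(x + h) - func(x - h)) / (2 * h)

def sec(x):
    return 1 / math.cos(x)

def csc(x):
    return 1 / math.sin(x)

def cot(x):
    return math.cos(x) / math.sin(x)

# Each entry pairs a function with the derivative formula claimed above.
pairs = {
    "sin": (math.sin, math.cos),
    "cos": (math.cos, lambda x: -math.sin(x)),
    "tan": (math.tan, lambda x: sec(x) ** 2),
    "csc": (csc, lambda x: -csc(x) * cot(x)),
    "sec": (sec, lambda x: sec(x) * math.tan(x)),
    "cot": (cot, lambda x: -(csc(x) ** 2)),
}

x0 = 0.7  # assumed sample point where all six functions are defined
errors = {name: abs(central_diff(fn, x0) - deriv(x0))
          for name, (fn, deriv) in pairs.items()}
print(errors)
```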


Solution to Exercise 5.7. A degree n polynomial can have at most n real zeroes. 


Proof. Proceed by induction. A degree one polynomial has form $P(x) = ax + b$ where $a \ne 0$. 
Thus, there is exactly one solution $x = -\frac{b}{a}$. Assume that a degree k polynomial can have 
at most k real zeroes. Let $P(x) = a_{k+1}x^{k+1} + \dots + a_1x + a_0$ be a degree $k + 1$ polynomial. 
Then $P'(x) = (k+1)a_{k+1}x^k + \dots + a_1$ is a degree k polynomial and has no more than k real 
zeroes by the induction hypothesis. Suppose P has $k + 2$ zeroes $x_1 < x_2 < \dots < x_{k+2}$. Then by 
Rolle's Theorem, there is a point $c_i \in (x_i, x_{i+1})$ so that $P'(c_i) = 0$ for each $1 \le i \le k + 1$, 
which means that $P'$ has $k + 1$ zeroes, contradicting the induction hypothesis. The result 
follows by induction. 


Solution to Exercise 5.8. Let $f'(x) > 4$ for all $x \in \mathbb{R}$, and let $f(0) = 2$. Then $f(2) > 10$. 


Proof. Since f is differentiable at all real numbers, it is also continuous at all real numbers, 
so by the Mean Value Theorem there is a point $c \in (0,2)$ so that $f'(c)(2 - 0) = f(2) - f(0) = 
f(2) - 2$. Since $f'(c) > 4$ we know that $f(2) - 2 > 8$ and hence $f(2) > 10$. 


Solution to Exercise 5.9. Give an example, with proof, of a function which is continuous 
but not differentiable. 


Proof. The most common example is $|x|$. We know that $f(x) = x$ and $f(x) = -x$ are both 
continuous by earlier theorems, so $|x|$ is continuous at every point other than zero. However, 
$\lim_{x \to 0^-} |x| = \lim_{x \to 0^-} -x = |0| = \lim_{x \to 0^+} x = \lim_{x \to 0^+} |x|$, so it follows that $|x|$ is also continuous at 0. 

On the other hand 

\[ \lim_{x \to 0^-} \frac{|x| - 0}{x - 0} = \lim_{x \to 0^-} \frac{-x}{x} = -1 \quad \text{whereas} \quad \lim_{x \to 0^+} \frac{|x| - 0}{x - 0} = \lim_{x \to 0^+} \frac{x}{x} = 1. \]

Thus, $|x|$ is not differentiable at 0. 
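
The disagreement of the one-sided difference quotients is easy to see numerically. In the Python sketch below, the function name and the sample step sizes are illustrative assumptions.

```python
def one_sided_quotients(h_values):
    """Difference quotients of |x| at 0 from the left and from the right."""
    left = [(abs(-h) - abs(0)) / (-h - 0) for h in h_values]
    right = [(abs(h) - abs(0)) / (h - 0) for h in h_values]
    return left, right

left, right = one_sided_quotients([0.1, 0.01, 0.001])
print(left, right)  # left quotients are all -1.0, right quotients are all 1.0
```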




Solution to Exercise 5.10. Let $f : I \to \mathbb{R}$ be a differentiable function, where I is 
an interval, having the property that $f'(x)$ is bounded. Then f is uniformly continuous. 
Furthermore, if $M > |f'(x)|$ for all $x \in I$ then $|f(x) - f(y)| < M|x - y|$ for all $x, y \in I$ so 
that $x \ne y$. 


Proof. Choose $M > 0$ so that $|f'(x)| < M$ for all $x \in I$ and let $\epsilon > 0$. Then setting $\delta = \frac{\epsilon}{M}$ 
we know that if $|x - y| < \delta$ for $x, y \in I$ with $x < y$, then by the Mean Value Theorem there is a point 
$c \in (x,y)$ so that $f'(c)(y - x) = f(y) - f(x)$, which means that $|f'(c)||y - x| = |f(y) - f(x)|$ 
and therefore $|f(y) - f(x)| < M \cdot \frac{\epsilon}{M} = \epsilon$. Thus, f is uniformly continuous. 


More generally, for any $x < y$ in I we can pick c between x and y so that $|f'(c)||y - x| = 
|f(y) - f(x)|$ by the Mean Value Theorem, which means that $|f(x) - f(y)| < M|x - y|$. 


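Solution to Exercise 5.11. If $f'(a) > 0$ then there is an open interval (c,d) containing a 
so that if $c < x_1 < a < x_2 < d$ then $f(x_1) < f(a) < f(x_2)$. 


Proof. First, by the definition of differentiable we can choose $\epsilon_1 > 0$ so that $(a - \epsilon_1, a + \epsilon_1) \subset 
\mathrm{dom}(f)$. Next, since $f'(a) > 0$ we can find $0 < \delta < \epsilon_1$ so that if $0 < |x - a| < \delta$ then 

\[ \left| \frac{f(x) - f(a)}{x - a} - f'(a) \right| < \frac{f'(a)}{2}, \quad \text{so} \quad \frac{f(x) - f(a)}{x - a} > \frac{f'(a)}{2} > 0. \]

If $a - \delta < x_1 < a < x_2 < a + \delta$ we know that $x_1 - a < 0$ and $x_2 - a > 0$, so it follows that 
$f(x_1) < f(a)$ and $f(a) < f(x_2)$. 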


Solution to Exercise 5.12. If f(x) is non-decreasing and differentiable on (a,b) then 
$f'(x) \ge 0$ for all $x \in (a,b)$. Furthermore, if f is non-decreasing and differentiable on [a,b] 
then $f'(x) \ge 0$ on [a,b]. 

Proof. Fix $x \in (a,b)$. Since f is non-decreasing on (a,b), we know that if $|h| > 0$ is small 
enough so that $(x - |h|, x + |h|) \subset (a,b)$ then $\frac{f(x+h) - f(x)}{h} \ge 0$. Thus, by the Comparison 
Theorem for limits we know that $\lim_{h \to 0} \frac{f(x+h) - f(x)}{h} = f'(x) \ge 0$. If f is also differentiable 
at a and b then similarly we have $\lim_{h \to 0^+} \frac{f(a+h) - f(a)}{h} = f'(a) \ge 0$ and 
$\lim_{h \to 0^-} \frac{f(b+h) - f(b)}{h} = f'(b) \ge 0$. 


Solution to Exercise 5.13. If f(x) is non-increasing and differentiable on (a,b) then 
$f'(x) \le 0$ for all $x \in (a,b)$. Furthermore, if f is non-increasing and differentiable on [a,b] 
then $f'(x) \le 0$ on [a,b]. 


Proof. Fix $x \in (a,b)$. Since f is non-increasing on (a,b), we know that if $|h| > 0$ is small enough 
so that $(x - |h|, x + |h|) \subset (a,b)$ then $\frac{f(x+h) - f(x)}{h} \le 0$. Thus, by the Comparison 
Theorem for limits we know that $\lim_{h \to 0} \frac{f(x+h) - f(x)}{h} = f'(x) \le 0$. If f is also differentiable 
at a and b then similarly we have $\lim_{h \to 0^+} \frac{f(a+h) - f(a)}{h} = f'(a) \le 0$ and 
$\lim_{h \to 0^-} \frac{f(b+h) - f(b)}{h} = f'(b) \le 0$. 


Solution to Exercise 5.14. (a) If f is a differentiable odd function (meaning that $f(x) = 
-f(-x)$ for all $x \in \mathrm{dom}(f)$) then $f'(x)$ is an even function (meaning that $f'(-x) = f'(x)$ 
for all $x \in \mathrm{dom}(f)$). 


(b) If f is an even differentiable function then $f'(x)$ is an odd function. 


Proof. (a) Using the Chain Rule, $(f(-x))' = -f'(-x)$. On the other hand, $f(-x) = -f(x)$ 
and $(-f(x))' = -f'(x)$. Thus, $f'(-x) = f'(x)$ and so $f'$ is even. 

(b) Using the Chain Rule, $(f(-x))' = -f'(-x)$. On the other hand, $f(-x) = f(x)$ and 
$(f(x))' = f'(x)$. Thus, $-f'(-x) = f'(x)$ and so $f'$ is odd. 


Solution to Exercise 5.15. Using the Inverse Function Theorem, we can argue that 
if we restrict sin(x) to the domain $-\frac{\pi}{2} \le x \le \frac{\pi}{2}$ then the inverse of this function is 
differentiable on the interior of its domain. Using this 
fact (or otherwise), show that $(\sin^{-1}(x))' = \frac{1}{\sqrt{1 - x^2}}$. 


Proof. As mentioned, the Inverse Function Theorem guarantees that if $y = \sin^{-1}(x)$ then $y'$ 
exists, so by the Chain Rule we have $\sin(y) = x$ so $\cos(y)y' = 1$. Since $\cos(y) > 0$ for 
$-\frac{\pi}{2} < y < \frac{\pi}{2}$, this means 

\[ y' = \frac{1}{\cos(y)} = \frac{1}{\sqrt{1 - \sin^2(y)}} = \frac{1}{\sqrt{1 - x^2}}. \]


Solution to Exercise 5.16. In a similar manner to the preceding theorem, find intervals 
on which the functions tan(x) and sec(x) could be restricted so that they would be invertible 
over their restricted domains, and derive formulas for the derivatives of $\tan^{-1}(x)$ and 
$\sec^{-1}(x)$. Using the fact that each of these inverse trigonometric functions, when added 
to its corresponding inverse co-function, has a sum of $\frac{\pi}{2}$ (or otherwise), find derivatives for 
$\cos^{-1}(x)$, $\csc^{-1}(x)$, and $\cot^{-1}(x)$. 


Proof. Normally, the convention is to restrict cos(x) to $[0, \pi]$, cot(x) to $(0, \pi)$, and tan(x) 
to $(-\frac{\pi}{2}, \frac{\pi}{2})$. There are multiple conventions for sec(x) and csc(x) but it seems to be easiest 
if we restrict sec(x) to $[0, \frac{\pi}{2}) \cup [\pi, \frac{3\pi}{2})$ and csc(x) to $(0, \frac{\pi}{2}] \cup (\pi, \frac{3\pi}{2}]$ because this makes 
derivatives simpler. 
Using the same methods described above, the Inverse Function Theorem guarantees 
that each of these inverse functions is differentiable, so we proceed as we did for $\sin^{-1}(x)$. 

If $y = \tan^{-1}(x)$ then $\tan(y) = x$ so $y'\sec^2(y) = 1$, which means that 

\[ y' = \frac{1}{\sec^2(y)} = \frac{1}{1 + \tan^2(y)} = \frac{1}{1 + x^2}. \]

If $y = \sec^{-1}(x)$ then $\sec(y) = x$ so $y'\sec(y)\tan(y) = 1$, which means that 

\[ y' = \frac{1}{\sec(y)\tan(y)} = \frac{1}{\sec(y)\sqrt{\sec^2(y) - 1}} = \frac{1}{x\sqrt{x^2 - 1}}. \]

Since $\cos^{-1}(x) = \frac{\pi}{2} - \sin^{-1}(x)$ we know that $(\cos^{-1}(x))' = \frac{-1}{\sqrt{1 - x^2}}$. 
Since $\cot^{-1}(x) = \frac{\pi}{2} - \tan^{-1}(x)$ we know that $(\cot^{-1}(x))' = \frac{-1}{1 + x^2}$. 
Since $\csc^{-1}(x) = \frac{\pi}{2} - \sec^{-1}(x)$ we know that $(\csc^{-1}(x))' = \frac{-1}{x\sqrt{x^2 - 1}}$. 
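
Three of these formulas can be spot-checked with Python's standard `math` module, whose `asin`, `acos` and `atan` use the same ranges as the conventions above. This is a sketch under assumptions: the sample point `x0 = 0.3` and the step size are arbitrary choices, and the inverse secant, cosecant and cotangent are left out because the standard library does not provide them.

```python
import math

def central_diff(func, x, h=1e-6):
    # Symmetric difference quotient approximating func'(x).
    return (func(x + h) - func(x - h)) / (2 * h)

x0 = 0.3  # assumed sample point inside (-1, 1)

# Each entry is (numerical derivative, value of the formula derived above).
checks = {
    "asin": (central_diff(math.asin, x0), 1 / math.sqrt(1 - x0**2)),
    "acos": (central_diff(math.acos, x0), -1 / math.sqrt(1 - x0**2)),
    "atan": (central_diff(math.atan, x0), 1 / (1 + x0**2)),
}
print(checks)
```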


Solution to Exercise 5.17. Give an example, with proof, of a function which is differentiable 
at a point, whose derivative is not differentiable at that point. 


Proof. Let $f(x) = x^2\sin(\frac{1}{x})$ if $x \ne 0$ and let $f(0) = 0$. Differentiation at every point 
other than zero can be conducted using the Chain Rule and Product Rule to give $f'(x) = 
2x\sin(\frac{1}{x}) - \cos(\frac{1}{x})$. However, at $x = 0$ we have 

\[ f'(0) = \lim_{x \to 0} \frac{x^2\sin(\frac{1}{x}) - 0}{x - 0} = \lim_{x \to 0} x\sin\left(\frac{1}{x}\right) = 0 \]

by Exercise 4.16. Thus, $f'(x) = 2x\sin(\frac{1}{x}) - \cos(\frac{1}{x})$ if $x \ne 0$ and $f'(0) = 0$. Since 
$\{\frac{1}{2n\pi}\} \to 0$ and $\{f'(\frac{1}{2n\pi})\} \to -1$, it follows that $f'$ is not continuous at 0 and therefore 
not differentiable at 0. 
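
To see the discontinuity of the derivative concretely, one can evaluate it along the sequence 1/(2nπ) used in the proof. The Python sketch below is illustrative; the function name `f_prime` and the range of n are assumptions.

```python
import math

def f_prime(x):
    """Derivative of x^2 sin(1/x), with the value 0 at x = 0 computed in the proof."""
    if x == 0:
        return 0.0
    return 2 * x * math.sin(1 / x) - math.cos(1 / x)

# Points 1/(2*n*pi) tend to 0, yet the derivative there stays near -1, not f'(0) = 0.
values = [f_prime(1 / (2 * n * math.pi)) for n in range(1, 6)]
print(values)
```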


Solution to Exercise 5.18. Give an example, with proof, of a function whose derivative 
is not bounded which is uniformly continuous. 


Proof. Let $f(x) = x\sin(\frac{1}{x})$ on (0,1]. Then f is differentiable and by the Chain Rule, 
Quotient Rule and Product Rule we know that $f'(x) = \sin(\frac{1}{x}) - \frac{\cos(\frac{1}{x})}{x}$. Note that 
$\{f'(\frac{1}{2n\pi})\} = \{-2n\pi\}$, which is unbounded, so $f'$ is unbounded. 

To see that f is uniformly continuous, let $0 < \epsilon < 1$. By Theorem 4.18, since f is 
continuous f is also uniformly continuous on the closed interval $[\frac{\epsilon}{4}, 1]$. Thus, there is a 
$\delta_1 > 0$ so that if $|x - y| < \delta_1$ then $|f(x) - f(y)| < \epsilon$ if $x, y \in [\frac{\epsilon}{4}, 1]$. We set $\delta = \min\{\frac{\epsilon}{4}, \delta_1\}$. 
Let $|x - y| < \delta$ with $y > x$. We know that $x\sin(\frac{1}{x})$ is between (or equal to) x and $-x$ for 
all $x \in (0,1)$ since $-1 \le \sin(\frac{1}{x}) \le 1$ for all x. Thus, the largest value f takes on in the 
interval [x,y] cannot exceed y and the smallest value is at least $-y$, so $|f(y) - f(x)| \le 2y$. 
Since $y - x < \frac{\epsilon}{4}$ it follows that if $y \ge \frac{\epsilon}{2}$ then $x > \frac{\epsilon}{4}$, so since $|x - y| < \delta_1$, we know that 
$|f(x) - f(y)| < \epsilon$. Otherwise $y < \frac{\epsilon}{2}$, which means that $|f(y) - f(x)| \le 2y < \epsilon$. Hence, f is 
uniformly continuous. 


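Solution to Exercise 5.19. For every non-negative integer n, if $h(x) = f(x)g(x)$, where 
f and g are n times differentiable, then $h^{(n)}(x) = \sum_{i=0}^{n} \binom{n}{i} f^{(n-i)}(x)\, g^{(i)}(x)$. 


Proof. First, the result is immediate if $n = 0$. Assume the result for $n = k$. Then 

\[ h^{(k+1)}(x) = \left( \sum_{i=0}^{k} \binom{k}{i} f^{(k-i)}(x)\, g^{(i)}(x) \right)', \]

which, by the product rule, is 

\[ \sum_{i=0}^{k} \binom{k}{i} \left( f^{(k+1-i)}(x)\, g^{(i)}(x) + f^{(k-i)}(x)\, g^{(i+1)}(x) \right), \]

which can be written as 

\[ \binom{k}{0} f^{(k+1)}(x)\, g(x) + \sum_{i=1}^{k} \left( \binom{k}{i} + \binom{k}{i-1} \right) f^{(k+1-i)}(x)\, g^{(i)}(x) + \binom{k}{k} f(x)\, g^{(k+1)}(x). \]

We know $\binom{k}{0} = 1 = \binom{k+1}{0}$ and $\binom{k}{k} = 1 = \binom{k+1}{k+1}$, and $\binom{k}{i} + \binom{k}{i-1} = \binom{k+1}{i}$ 
by Theorem 2.7. Hence, 

\[ h^{(k+1)}(x) = \sum_{i=0}^{k+1} \binom{k+1}{i} f^{(k+1-i)}(x)\, g^{(i)}(x), \]

as desired. The result follows by induction. 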
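
The product formula can be checked exactly on polynomials, where derivatives are simple coefficient manipulations. The Python sketch below is an illustration under assumptions: polynomials are stored as coefficient lists (constant term first), and the helper names and the two sample polynomials are hypothetical choices, not part of the text.

```python
from math import comb

def poly_mul(p, q):
    # Product of two coefficient lists.
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p, times=1):
    # Differentiate a coefficient list the given number of times.
    for _ in range(times):
        p = [k * c for k, c in enumerate(p)][1:] or [0]
    return p

def poly_add(p, q):
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def trim(p):
    # Drop trailing zero coefficients so equal polynomials compare equal.
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

f = [1, 0, 2, 3]      # 1 + 2x^2 + 3x^3
g = [4, -1, 0, 0, 5]  # 4 - x + 5x^4
n = 3

lhs = poly_diff(poly_mul(f, g), n)  # n-th derivative of the product
rhs = [0]
for i in range(n + 1):
    term = poly_mul(poly_diff(f, n - i), poly_diff(g, i))
    rhs = poly_add(rhs, [comb(n, i) * c for c in term])

print(trim(lhs) == trim(rhs))
```

Integer coefficients keep the comparison exact, so no tolerance is needed.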


Solution to Exercise 5.20. Let $\epsilon > 0$ and let $h, g : D \to \mathbb{R}$ be differentiable on $(a - \epsilon, a + \epsilon)$. 
Let f be a real valued function so that $f(x) = g(x)$ if $a - \epsilon < x < a$ and let $f(x) = h(x)$ 
if $a < x < a + \epsilon$, where f is continuous at $x = a$. If $\lim_{x \to a^-} g'(x) = k = \lim_{x \to a^+} h'(x)$ then 
$f'(a) = k$. 


Proof. We know h, g are continuous at a since they are differentiable at a. Since f, g and h 
are continuous at a we know that $\lim_{x \to a^-} g(x) = g(a) = f(a) = \lim_{x \to a^-} f(x)$ since $f(x) = g(x)$ 
for $x \in (a - \epsilon, a)$. Likewise, $\lim_{x \to a^+} h(x) = h(a) = f(a)$. It follows that 

\[ \lim_{x \to a^-} \frac{f(x) - f(a)}{x - a} = \lim_{x \to a^-} \frac{g(x) - g(a)}{x - a} = g'(a) = k \]

and 

\[ \lim_{x \to a^+} \frac{f(x) - f(a)}{x - a} = \lim_{x \to a^+} \frac{h(x) - h(a)}{x - a} = h'(a) = k. \]

Since $g'(a) = h'(a)$ the left and right limits are equal and $\lim_{x \to a} \frac{f(x) - f(a)}{x - a} = k = f'(a)$. 


Chapter 6 


Integration 


There are different approaches to developing integration theorems. Three popular methods are: 
first, using upper and lower sums to get upper and lower integrals, which identify the integral 
when an integral exists; second, using limits of Riemann sums with markings that can be 
taken within subintervals induced by partitions, and defining the integral to be the limit 
of such sums if this limit is unaffected by the markings; and third, using a theorem of 
Lebesgue that a bounded function is integrable if and only if the set of discontinuities of the 
function has Lebesgue measure zero. In the third case, we get a powerful tool for determining 
integrability, but we still have to use another method to actually find the integral. We will 
use a development based on the first two ways of describing integrability, and address the 
third technique in the Supplementary Materials section. We may give multiple proofs of 
some results when different approaches seem to have advantages. In the case where we 
use Lebesgue's characterization of Riemann integrability for the argument we will also give 
an argument using either upper and lower sums or Riemann sums (so no theorem in this 
section will rely on using the Supplementary Materials). 


Definition 32 


Let $f : [a,b] \to \mathbb{R}$ be bounded. A finite subset $P = \{x_0, x_1, \dots, x_n\} \subset [a,b]$, 
understood to be listed in order with $x_0 = a < x_1 < x_2 < \dots < x_n = b$, is called a 
partition of [a,b]. If P and Q are partitions of [a,b] and $P \subset Q$ then we say that Q is 
a refinement of P (or that Q refines P). A collection of points $T = \{x_1^*, x_2^*, \dots, x_n^*\}$ so 
that $x_i^* \in [x_{i-1}, x_i]$ for each $i \in \{1, 2, 3, \dots, n\}$ is called a marking of P, and we denote 
the Riemann sum with this marking by $S_T(f, P) = \sum_{i=1}^{n} f(x_i^*)(x_i - x_{i-1})$. If we let 
$M_i = \sup_{x \in [x_{i-1}, x_i]} f(x)$ and $m_i = \inf_{x \in [x_{i-1}, x_i]} f(x)$ then we say that 
$U(f, P) = \sum_{i=1}^{n} M_i(x_i - x_{i-1})$ is the upper sum of f with respect to P, and 
$L(f, P) = \sum_{i=1}^{n} m_i(x_i - x_{i-1})$ is the lower sum of f with respect to P. If the partition 
is understood then we sometimes use the notation $M_i, m_i$ for suprema and infima of function 
values on induced subintervals of the partition without declaring them to be thus defined in 
an argument. We call $|P| = \max(x_1 - x_0, x_2 - x_1, \dots, x_n - x_{n-1})$ the mesh of the partition P. The 
upper integral of f on [a,b] is denoted $(U)\int_a^b f$, and is the infimum of all upper 
sums of f on [a,b]. The lower integral of f on [a,b] is denoted $(L)\int_a^b f$, and is 
the supremum of all lower sums of f on [a,b]. If the upper and lower integrals are 
equal then we say that f is integrable on [a,b] and that the integral of f over [a,b] is 
$\int_a^b f = \int_a^b f(x)\,dx$, the common value of the upper and lower integrals. 


Let $\mathbb{R}^2 = \mathbb{R} \times \mathbb{R}$, the set of ordered pairs (a,b), where a and b are real numbers. 
If $f(x) \ge g(x)$ for all $a \le x \le b$ then we say that $\int_a^b f(x) - g(x)\,dx$ is the area 
between the curves $y = f(x)$ and $y = g(x)$ in $\mathbb{R}^2$. 


In some cases it may be useful to distinguish between infima and suprema of different 
functions on the same subintervals induced by a partition or on different partitions. 


Definition 33 


Let $M_i^f(P)$ denote the supremum of the function $f$ on $[x_{i-1}, x_i]$, the $i$th subinterval
induced by the partition $P$, and let $m_i^f(P)$ denote the infimum of $f$ on $[x_{i-1}, x_i]$.
If the partition is understood we may just write $M_i^f, m_i^f$, and if
the function is understood we may write $M_i(P), m_i(P)$ without the superscript.
It is sometimes convenient to take suprema or infima over all partitions (or other
types of sets) without specifying the set of all partitions, as long as the interval
over which the partitions are taken is understood. We use the notation $\sup_P f(P)$ and
$\inf_P f(P)$ to denote the supremum and infimum, respectively, of the function $f(P)$ over
all possible partitions $P$ of the interval in question (or sets $P$ which are understood to
be an element of a certain collection).


Not all functions are integrable. Unbounded functions are not integrable, but many 
bounded functions are also not integrable. Here is an example of such a function. 


Example 6.1. Give an example of a bounded function which is not integrable. 


Solution. Let $f(x) = 0$ if $x \in \mathbb{Q}$ and let $f(x) = 1$ otherwise. Let $P = \{x_0, x_1, x_2, \ldots, x_n\}$ be
a partition of $[0,1]$. Since every subinterval $[x_{i-1}, x_i]$ induced by $P$ contains both rational
and irrational numbers, $U(f,P) = \sum_{i=1}^{n} M_i (x_i - x_{i-1}) = \sum_{i=1}^{n} (1)(x_i - x_{i-1}) = 1$, whereas
$L(f,P) = \sum_{i=1}^{n} m_i (x_i - x_{i-1}) = \sum_{i=1}^{n} (0)(x_i - x_{i-1}) = 0$. Thus, the upper integral $(U)\int_0^1 f = 1$
and the lower integral $(L)\int_0^1 f = 0$. Since these are not equal, $f$ is not integrable on $[0,1]$.




Below is an illustration of a lower sum. The area beneath the red rectangles is $L(f, P)$
for the partition $P$ shown and $f(x) = \sin(x) + 1$. Since the function is continuous, the
infimum on each subinterval induced by the partition is just the minimum value of the 
function on the subinterval. Multiplying the subinterval length by the minimum value of 
the function on the subinterval gives the area enclosed by the rectangles shown, which is 
less than the area under the curve. 


Below is a picture of the upper sum of the same function with the same partition. The 
area enclosed by the blue rectangles is the upper sum. Note that the area under the curve 
would be between the upper and lower sums. 




Theorem 6.1. Let $f : [a,b] \to \mathbb{R}$ be bounded. Let $P = \{x_0, x_1, \ldots, x_n\}$ be a partition of
$[a,b]$. Then for any marking $T$ of $P$, $L(f,P) \le S_T(f,P) \le U(f,P)$.




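Proof. Let $T = \{x_1^*, x_2^*, \ldots, x_n^*\}$ be a marking for $P$. Then for each $i \le n$ we know that
$m_i \le f(x_i^*) \le M_i$, so $\sum_{i=1}^{n} m_i (x_i - x_{i-1}) \le \sum_{i=1}^{n} f(x_i^*)(x_i - x_{i-1}) \le \sum_{i=1}^{n} M_i (x_i - x_{i-1})$.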


In the following example, let P = {0,1,2,3,4,5} be a partition of [0,5]. We sketch 
a graph of U(f,P) and L(f,P) for f(x) = x?. In this case, both upper and lower sums 
are sketched in the same picture. We have taken fewer subdivisions in this example, and
the upper and lower sums do not look very close to the area beneath the curve (which is
the integral since the function is non-negative). In general, we will find that for integrable
functions, if we can make the mesh of the partition small enough (usually by taking more 
partition points that are evenly spaced in the interval) then we can force the resulting upper, 
lower and Riemann sums to be as close as we wish to the integral, whereas such sums tend 
not to be close to the integral when the number of subdivisions is small. 
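
The effect of shrinking the mesh can be checked numerically. The following is a minimal sketch, not part of the text's development: it assumes the integrand is increasing, so that the infimum and supremum on each subinterval are simply the values at the left and right endpoints. The integrand `square` and the helper names `lower_upper_sums` and `uniform_partition` are illustrative assumptions.

```python
def lower_upper_sums(f, partition):
    """Return (L(f, P), U(f, P)) for a function f that is increasing on the interval.

    For an increasing f, the infimum on [x_{i-1}, x_i] is f(x_{i-1}) and the
    supremum is f(x_i), so no search for extreme values is needed.
    """
    pairs = list(zip(partition, partition[1:]))
    lower = sum(f(left) * (right - left) for left, right in pairs)
    upper = sum(f(right) * (right - left) for left, right in pairs)
    return lower, upper


def uniform_partition(a, b, n):
    """Partition of [a, b] into n subintervals of equal length (mesh (b - a) / n)."""
    return [a + (b - a) * i / n for i in range(n + 1)]


def square(x):
    # Illustrative integrand (an assumption): increasing on [0, 5].
    return x * x


coarse = lower_upper_sums(square, [0, 1, 2, 3, 4, 5])
fine = lower_upper_sums(square, uniform_partition(0, 5, 500))
print(coarse)             # (30, 55)
print(fine[1] - fine[0])  # gap between upper and lower sums for mesh 0.01
```

With six partition points the two sums differ by 25, while with mesh 0.01 the gap is about 0.25, and in both cases the integral 125/3 lies between the lower and upper sums.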


Upper and Lower Sums of f 


The area in blue is the difference between the upper and lower sums. The area in red 
is the lower sum. The sum of the two shown areas is the upper sum. Note that the upper 
sum for a partition is an area that is at least equal to the area under the curve, the lower 
sum for the partition is no more than the area under the curve, and that integrable functions
are bounded functions for which the difference between the upper and lower sums (the blue
area) can be made arbitrarily small. 

A Riemann sum is at least as large as the lower sum and no larger than the upper sum 
for a given partition. It is formed by taking function values at points in each subinterval and 
multiplying those function values by the subinterval lengths and adding the result. This 
corresponds to an area like that shown below. We use the same function and partition as
the preceding example for this picture. We will use the midpoints of the intervals for the
marking, however.


Riemann sum for f 




Theorem 6.2. Let $f : [a,b] \to \mathbb{R}$ be bounded. Let $P = \{x_0, x_1, \ldots, x_n\}$ be a partition of
$[a,b]$, and let $Q$ be a refinement of $P$. Then $L(f,P) \le L(f,Q) \le U(f,Q) \le U(f,P)$.


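Proof. We first prove the result is true when $Q = P \cup \{q\}$, where $q \in (x_{i-1}, x_i)$, and
$M_i = \sup_{x \in [x_{i-1}, x_i]} f(x)$. Then
$$U(f,P) - U(f,Q) = M_i (x_i - x_{i-1}) - \sup_{x \in [x_{i-1}, q]} f(x)\,(q - x_{i-1}) - \sup_{x \in [q, x_i]} f(x)\,(x_i - q) \ge 0$$
since $M_i \ge \max\left(\sup_{x \in [x_{i-1}, q]} f(x), \sup_{x \in [q, x_i]} f(x)\right)$ by Exercise 1.17. Similarly,
$$L(f,P) - L(f,Q) = m_i (x_i - x_{i-1}) - \inf_{x \in [x_{i-1}, q]} f(x)\,(q - x_{i-1}) - \inf_{x \in [q, x_i]} f(x)\,(x_i - q) \le 0$$
since $m_i \le \min\left(\inf_{x \in [x_{i-1}, q]} f(x), \inf_{x \in [q, x_i]} f(x)\right)$. Since $L(f,Q) \le U(f,Q)$ by Theorem 6.1,
it follows that $L(f,P) \le L(f,Q) \le U(f,Q) \le U(f,P)$.
Next, let $Q = P \cup \{q_1, q_2, \ldots, q_m\}$. Then by the first part of the argument, it follows
that $U(f,Q) \le U(f, P \cup \{q_1, q_2, \ldots, q_{m-1}\}) \le U(f, P \cup \{q_1, q_2, \ldots, q_{m-2}\}) \le \cdots \le U(f,P)$,
and in the same way $L(f,P) \le \cdots \le L(f, P \cup \{q_1, q_2, \ldots, q_{m-1}\}) \le L(f,Q)$.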


The following figure illustrates why a refinement adding a single point causes the upper 
sum to decrease and the lower sum to increase. 




Refinement With Added Point q 




Theorem 6.3. Let $f : [a,b] \to \mathbb{R}$ be bounded. Let $P$ and $Q$ be partitions of $[a,b]$. Then
$L(f,P) \le U(f,Q)$. Furthermore, $(L)\int_a^b f \le (U)\int_a^b f$.


Proof. By Theorem 6.2 we know that $L(f,P) \le L(f, P \cup Q) \le U(f, P \cup Q) \le U(f,Q)$.
Since for any partition $Q$ of $[a,b]$ it is true that $U(f,Q) \ge L(f,P)$ for each partition $P$
of $[a,b]$, we know that $U(f,Q) \ge (L)\int_a^b f$, so $(L)\int_a^b f$ is a lower bound for all upper sums
$U(f,Q)$ of $[a,b]$, which means that $(L)\int_a^b f \le (U)\int_a^b f$.




Theorem 6.4. Let $f : [a,b] \to \mathbb{R}$ be bounded. Then $f$ is integrable if and only if for every
$\epsilon > 0$ there is a partition $R$ of $[a,b]$ so that $U(f,R) - L(f,R) < \epsilon$.


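Proof. First, assume that $f$ is integrable. By the Approximation Property, we can find
partitions $P$ and $Q$ of $[a,b]$ so that $U(f,P) < (U)\int_a^b f + \frac{\epsilon}{2}$ and $L(f,Q) > (L)\int_a^b f - \frac{\epsilon}{2}$,
which means that
$$\int_a^b f - \frac{\epsilon}{2} < L(f,Q) \le L(f, P \cup Q) \le U(f, P \cup Q) \le U(f,P) < \int_a^b f + \frac{\epsilon}{2}.$$
Thus, it follows that $U(f, P \cup Q) - L(f, P \cup Q) < \epsilon$.
Next, assume that for every $\epsilon > 0$ there is a partition $R$ of $[a,b]$ so that $U(f,R) -
L(f,R) < \epsilon$. Let $\epsilon > 0$. Choose $R$ so that $U(f,R) - L(f,R) < \epsilon$. But then by Theorem 6.3
we know that $L(f,R) \le (L)\int_a^b f \le (U)\int_a^b f \le U(f,R)$, so $0 \le (U)\int_a^b f - (L)\int_a^b f < \epsilon$.
Since this is true for all $\epsilon > 0$ it follows that $(U)\int_a^b f = (L)\int_a^b f$, so $f$ is integrable.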


Theorem 6.5. Let $f : [a,b] \to \mathbb{R}$ be bounded, and let $\epsilon > 0$ and let $P = \{x_0, \ldots, x_n\}$ be a
partition of $[a,b]$. Then there are markings $T$ and $R$ of $P$ so that $U(f,P) - S_T(f,P) < \epsilon$
and $S_R(f,P) - L(f,P) < \epsilon$.


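Proof. Since $M_i = \sup_{x \in [x_{i-1}, x_i]} f(x)$ and $m_i = \inf_{x \in [x_{i-1}, x_i]} f(x)$, by the Approximation Property
we can find points $r_i^*, t_i^* \in [x_{i-1}, x_i]$ for each $i \in \{1, 2, 3, \ldots, n\}$ so that $M_i - f(t_i^*) < \frac{\epsilon}{b-a}$ and
$f(r_i^*) - m_i < \frac{\epsilon}{b-a}$. We obtain markings $T = \{t_1^*, t_2^*, t_3^*, \ldots, t_n^*\}$ and $R = \{r_1^*, r_2^*, r_3^*, \ldots, r_n^*\}$ so that
$$U(f,P) - S_T(f,P) = \sum_{i=1}^{n} (M_i - f(t_i^*))(x_i - x_{i-1}) < \frac{\epsilon}{b-a} \sum_{i=1}^{n} (x_i - x_{i-1}) = \frac{\epsilon}{b-a}(b-a) = \epsilon$$
and
$$S_R(f,P) - L(f,P) = \sum_{i=1}^{n} (f(r_i^*) - m_i)(x_i - x_{i-1}) < \frac{\epsilon}{b-a} \sum_{i=1}^{n} (x_i - x_{i-1}) = \frac{\epsilon}{b-a}(b-a) = \epsilon.$$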


Theorem 6.6. Let $f : [a,b] \to \mathbb{R}$ be integrable. Then for every $\epsilon > 0$ there is a number
$\delta > 0$ so that if $Q$ is a partition of $[a,b]$ with $|Q| < \delta$ then $U(f,Q) - L(f,Q) < \epsilon$.


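Proof. Since $f$ is integrable, we know that we can find $M > 0$ so that $|f(x)| \le M$ for all
$x \in [a,b]$ and we can find a partition $P = \{x_0, \ldots, x_n\}$ so that $U(f,P) - L(f,P) < \frac{\epsilon}{2}$. Let
$\delta = \frac{\epsilon}{4Mn}$ and let $Q = \{q_0, \ldots, q_m\}$ be a partition with $|Q| < \delta$. Then
$$U(f,Q) - L(f,Q) = \sum_{\{j \in \mathbb{N} \mid [q_{j-1}, q_j] \subseteq [x_{i-1}, x_i] \text{ for some } i \le n\}} (M_j^f(Q) - m_j^f(Q))(q_j - q_{j-1}) + \sum_{\{j \in \mathbb{N} \mid x_i \in (q_{j-1}, q_j) \text{ for some } i \le n\}} (M_j^f(Q) - m_j^f(Q))(q_j - q_{j-1}).$$
For each $i \in \{1, 2, \ldots, n\}$ we know that if $[q_{j-1}, q_j] \subseteq [x_{i-1}, x_i]$ then $m_i^f(P) \le m_j^f(Q) \le
M_j^f(Q) \le M_i^f(P)$ by Exercise 1.17, so
$$\sum_{\{j \in \mathbb{N} \mid [q_{j-1}, q_j] \subseteq [x_{i-1}, x_i]\}} (M_j^f(Q) - m_j^f(Q))(q_j - q_{j-1}) \le (M_i^f(P) - m_i^f(P)) \sum_{\{j \in \mathbb{N} \mid [q_{j-1}, q_j] \subseteq [x_{i-1}, x_i]\}} (q_j - q_{j-1}) \le (M_i^f(P) - m_i^f(P))(x_i - x_{i-1}).$$
Thus,
$$\sum_{\{j \in \mathbb{N} \mid [q_{j-1}, q_j] \subseteq [x_{i-1}, x_i] \text{ for some } i \le n\}} (M_j^f(Q) - m_j^f(Q))(q_j - q_{j-1}) \le U(f,P) - L(f,P) < \frac{\epsilon}{2}.$$
Since $|Q| < \frac{\epsilon}{4Mn}$ and there are at most $n-1$ integers $j$ so that $x_i \in (q_{j-1}, q_j)$ for some $x_i \in
P$, it follows that
$$\sum_{\{j \in \mathbb{N} \mid x_i \in (q_{j-1}, q_j) \text{ for some } x_i \in P\}} (M_j^f(Q) - m_j^f(Q))(q_j - q_{j-1}) \le 2M(n-1)\left(\frac{\epsilon}{4Mn}\right) < \frac{\epsilon}{2}.$$
Thus, $U(f,Q) - L(f,Q) < \epsilon$.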


Theorem 6.7. Let $f : [a,b] \to \mathbb{R}$ be bounded and let $\{P_n\}$ be a sequence of partitions of $[a,b]$
so that $\{|P_n|\} \to 0$. Then $\int_a^b f(x)\,dx$ exists if and only if there is a number $I$ so that for any
choice of markings $T_i$ of $P_i$ for each $i \in \mathbb{N}$, the sequence $\{S_{T_n}(f, P_n)\} \to I$, in which case
$\int_a^b f(x)\,dx = I$ and, for every $\epsilon > 0$, there is a $k \in \mathbb{N}$ so that if $n > k$ then $|S_{T_n}(f, P_n) - I| < \epsilon$
regardless of marking $T_n$.
Proof. Let $\{P_n\}$ be a sequence of partitions of $[a,b]$ so that $\{|P_n|\} \to 0$ and let $\epsilon > 0$.

First, assume $\int_a^b f(x)\,dx = I$. By Theorem 6.6 we can find a $\delta > 0$ so that if $|P| < \delta$
then $U(f,P) - L(f,P) < \epsilon$. For some $k \in \mathbb{N}$, if $n > k$ then $|P_n| < \delta$. Since $I, S_{T_n}(f, P_n) \in
[L(f, P_n), U(f, P_n)]$ (regardless of marking $T_n$), it follows that $|S_{T_n}(f, P_n) - I| < \epsilon$, so
$\{S_{T_n}(f, P_n)\} \to I$.

Next, assume that for every choice of markings $T_n$ of $P_n$, $\{S_{T_n}(f, P_n)\} \to I$. By Theorem
6.5, for each $n \in \mathbb{N}$ we can choose markings $U_n$ and $L_n$ of $P_n$ so that $U(f, P_n) - S_{U_n}(f, P_n) < \frac{\epsilon}{4}$
and $S_{L_n}(f, P_n) - L(f, P_n) < \frac{\epsilon}{4}$. Choose $N \in \mathbb{N}$ so that if $n \ge N$ then $|S_{U_n}(f, P_n) - I| < \frac{\epsilon}{4}$
and $|S_{L_n}(f, P_n) - I| < \frac{\epsilon}{4}$. Then
$$U(f, P_N) - L(f, P_N) \le |U(f, P_N) - S_{U_N}(f, P_N)| + |S_{U_N}(f, P_N) - I| + |I - S_{L_N}(f, P_N)| + |S_{L_N}(f, P_N) - L(f, P_N)| < \epsilon,$$
so $f$ is integrable.
Also, $|U(f, P_N) - I| \le |U(f, P_N) - S_{U_N}(f, P_N)| + |S_{U_N}(f, P_N) - I| < \frac{\epsilon}{2}$ and $\int_a^b f(x)\,dx \in
[L(f, P_N), U(f, P_N)]$, so
$$\left|\int_a^b f(x)\,dx - I\right| \le \left|U(f, P_N) - \int_a^b f(x)\,dx\right| + |U(f, P_N) - I| < \frac{3\epsilon}{2}.$$
Since this is true for every $\epsilon > 0$ it must follow that $\int_a^b f(x)\,dx = I$.




This establishes the characterization of the integral in terms of Riemann sums and upper and 
lower sums. To see the characterization of integrability in terms of a set of discontinuities 
having Lebesgue measure zero, see the Supplementary Materials. 
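
The Riemann sum characterization can be illustrated with a short experiment. This is a sketch under stated assumptions rather than a proof: the integrand $x^2$ on $[0,5]$, the uniform partitions, and the helper names `riemann_sum`, `random_marking` and `uniform_partition` are all illustrative choices. The markings are chosen at random to mimic the phrase "regardless of marking".

```python
import random


def riemann_sum(f, partition, marking):
    """S_T(f, P): the sum of f(x_i*) * (x_i - x_{i-1}) over the subintervals of P."""
    return sum(f(t) * (right - left)
               for t, left, right in zip(marking, partition, partition[1:]))


def random_marking(partition, rng):
    """Pick one arbitrary point x_i* in each subinterval [x_{i-1}, x_i]."""
    return [rng.uniform(left, right)
            for left, right in zip(partition, partition[1:])]


def uniform_partition(a, b, n):
    """Partition of [a, b] into n equal subintervals, so the mesh is (b - a) / n."""
    return [a + (b - a) * i / n for i in range(n + 1)]


rng = random.Random(0)   # fixed seed so that a run is repeatable
exact = 125 / 3          # the integral of x^2 over [0, 5]
for n in (5, 50, 500, 5000):
    p = uniform_partition(0.0, 5.0, n)
    s = riemann_sum(lambda x: x * x, p, random_marking(p, rng))
    print(n, abs(s - exact))
```

For this integrand the distance from the Riemann sum to the integral is at most $U(f,P_n) - L(f,P_n) = 125/n$, whatever marking is drawn, so the printed errors shrink as the mesh does.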


Theorem 6.8. If $f : [a,b] \to \mathbb{R}$ is continuous then $f$ is integrable.


Proof. Since $f$ is continuous on the closed and bounded interval $[a,b]$, from Theorem 4.18,
we know that $f$ is uniformly continuous. Let $\epsilon > 0$. Choose $\delta > 0$ so that if $|x - y| < \delta$
then $|f(x) - f(y)| < \frac{\epsilon}{b-a}$. Let $P = \{x_0, x_1, \ldots, x_n\}$ be a partition of $[a,b]$ with $|P| < \delta$.
By the Extreme Value Theorem there are points $s_i, t_i \in [x_{i-1}, x_i]$ for each positive integer
$i \le n$ so that $f(s_i) \le f(x) \le f(t_i)$ for each $x \in [x_{i-1}, x_i]$. Hence,
$$U(f,P) - L(f,P) = \sum_{i=1}^{n} (f(t_i) - f(s_i))(x_i - x_{i-1}) < \frac{\epsilon}{b-a} \sum_{i=1}^{n} (x_i - x_{i-1}) = \epsilon.$$
Thus, $f$ is integrable.


Alternate proof using theorems from the Supplementary Materials: 


Proof. Since f is continuous, the set of points in the domain of f at which f is not continuous 
has Lebesgue measure zero, so by Theorem 7.73, f is integrable. 


Theorem 6.9. If $f$ and $g$ are integrable on $[a,b]$ and $s,t \in \mathbb{R}$ then $\int_a^b (sf + tg) = s\int_a^b f + t\int_a^b g$.
Proof. For any partition $P = \{x_0, x_1, \ldots, x_k\}$ of $[a,b]$ and marking $T = \{x_1^*, \ldots, x_k^*\}$ of
$P$, the Riemann sum
$$S_T(sf + tg, P) = \sum_{i=1}^{k} (sf(x_i^*) + tg(x_i^*))(x_i - x_{i-1}) = s\sum_{i=1}^{k} f(x_i^*)(x_i - x_{i-1}) + t\sum_{i=1}^{k} g(x_i^*)(x_i - x_{i-1}) = sS_T(f,P) + tS_T(g,P).$$
Let $\{P_n\}$ be a sequence of partitions of $[a,b]$ so that $\{|P_n|\} \to 0$ and let $T_n$ be a
marking of $P_n$ for each $n \in \mathbb{N}$. By Theorem 6.7, we know that $\{S_{T_n}(f, P_n)\} \to \int_a^b f$ and
$\{S_{T_n}(g, P_n)\} \to \int_a^b g$. Thus, by the product and sum theorems for limits of sequences,
$\{S_{T_n}(sf + tg, P_n)\} = \{sS_{T_n}(f, P_n) + tS_{T_n}(g, P_n)\} \to s\int_a^b f + t\int_a^b g$. Thus, by Theorem 6.7,
we know that $\int_a^b (sf + tg) = s\int_a^b f + t\int_a^b g$.


Theorem 6.10. Let $f : [a,b] \to [c,d]$, where $f$ is integrable on $[a,b]$ and $g$ is continuous on
$[c,d]$. Then $g \circ f$ is integrable.


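Proof. Let $\epsilon > 0$. Choose $M > 0$ so that $|g(x)| \le M$ for all $x \in [c,d]$. By Theorem 4.18, we
know that $g$ is uniformly continuous on $[c,d]$. We choose $\delta > 0$ so that if $|x - y| < \delta$ and
$x, y \in [c,d]$ then $|g(x) - g(y)| < \frac{\epsilon}{2(b-a)}$.
Choose a partition $P = \{x_0, \ldots, x_n\}$ so that
$U(f,P) - L(f,P) < \frac{\epsilon\delta}{4M}$. Then $L = \sum_{\{i \in \mathbb{N} \mid M_i^f - m_i^f \ge \delta\}} (x_i - x_{i-1}) < \frac{\epsilon}{4M}$, since otherwise
$U(f,P) - L(f,P) \ge L\delta \ge \frac{\epsilon\delta}{4M}$.
If $M_i^f - m_i^f < \delta$ then let $\gamma > 0$. By the Approximation Property we can find $x, y \in
[x_{i-1}, x_i]$ so that $M_i^{g \circ f} - g(f(x)) < \frac{\gamma}{2}$ and $g(f(y)) - m_i^{g \circ f} < \frac{\gamma}{2}$, so
$M_i^{g \circ f} - m_i^{g \circ f} < g(f(x)) - g(f(y)) + \gamma < \frac{\epsilon}{2(b-a)} + \gamma$ (since $|f(x) - f(y)| < \delta$). Since this is true for all $\gamma > 0$,
it follows that $M_i^{g \circ f} - m_i^{g \circ f} \le \frac{\epsilon}{2(b-a)}$. Thus, it follows that
$$U(g \circ f, P) - L(g \circ f, P) = \sum_{\{i \in \mathbb{N} \mid M_i^f - m_i^f \ge \delta\}} (M_i^{g \circ f} - m_i^{g \circ f})(x_i - x_{i-1}) + \sum_{\{i \in \mathbb{N} \mid M_i^f - m_i^f < \delta\}} (M_i^{g \circ f} - m_i^{g \circ f})(x_i - x_{i-1}) < 2M\left(\frac{\epsilon}{4M}\right) + (b-a)\left(\frac{\epsilon}{2(b-a)}\right) = \epsilon.$$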


Alternate proof using theorems from the Supplementary Materials: 


Proof. Let $E_f = \{x \in [a,b] \mid f \text{ is not continuous at } x\}$, and $E_{g \circ f} = \{x \in [a,b] \mid g \circ f \text{ is not
continuous at } x\}$. By Theorem 7.73 we know $\lambda(E_f) = 0$, and we know that $E_{g \circ f} \subseteq E_f$
by Theorem 4.12, so $\lambda(E_{g \circ f}) = 0$ by Theorem 7.60, so the result follows from Theorem
7.73.


Theorem 6.11. Let $f, g : [a,b] \to \mathbb{R}$ be integrable. Then $fg$ is integrable.


Proof. First, we know that the function $h(x) = x^2$ is continuous. Hence, by
Theorem 6.10, we know that the square of an integrable function is integrable. Since
$fg = \frac{1}{4}\left((f + g)^2 - (f - g)^2\right)$, and $f + g$ and $f - g$ are integrable by Theorem 6.9, it follows
that $fg$ is a linear combination of integrable functions and is integrable by Theorem 6.9.


Alternate proof using theorems from the Supplementary Materials: 


Proof. Let $E_f = \{x \in [a,b] \mid f \text{ is not continuous at } x\}$, and $E_g = \{x \in [a,b] \mid g \text{ is not
continuous at } x\}$ and let $E_{fg} = \{x \in [a,b] \mid fg \text{ is not continuous at } x\}$. By Theorem 7.73,
$\lambda(E_f) = 0 = \lambda(E_g)$. By Theorem 4.7, we know that $E_{fg} \subseteq E_f \cup E_g$. By Theorem 7.62,
we know that $\lambda(E_f \cup E_g) = 0$, so by Theorem 7.60 it follows that $\lambda(E_{fg}) = 0$. Thus, $fg$ is
integrable by Theorem 7.73.


Theorem 6.12. First Mean Value Theorem for Integrals. Let $f, g : [a,b] \to \mathbb{R}$ where $f$ is
continuous and $g$ is integrable, and let $g(x) \ge 0$ for all $x \in [a,b]$. Then there is a point
$c \in [a,b]$ so that $\int_a^b f(x)g(x)\,dx = f(c)\int_a^b g(x)\,dx$.




Proof. First, integrability of $fg$ follows from Theorem 6.11. By the Extreme Value Theorem
there are points $s, t \in [a,b]$ so that $f(s) \le f(x) \le f(t)$ for all $x \in [a,b]$. Hence, since $g(x) \ge 0$
it follows that $f(s)g(x) \le f(x)g(x) \le f(t)g(x)$, so
$$f(s)\int_a^b g(x)\,dx \le \int_a^b f(x)g(x)\,dx \le f(t)\int_a^b g(x)\,dx$$
by Exercise 6.4. If $\int_a^b g(x)\,dx = 0$ then the result follows for any $c \in
[a,b]$. Otherwise, $f(s) \le \frac{\int_a^b f(x)g(x)\,dx}{\int_a^b g(x)\,dx} \le f(t)$, so by the Intermediate Value Theorem we
can find a point $c$ between $s$ and $t$ or equal to $s$ or $t$ so that $f(c) = \frac{\int_a^b f(x)g(x)\,dx}{\int_a^b g(x)\,dx}$, so
$\int_a^b f(x)g(x)\,dx = f(c)\int_a^b g(x)\,dx$.


Definition 34 


If $f$ is integrable on $[a,b]$ then we define $\int_b^a f = -\int_a^b f$. We also define $\int_a^a f = 0$
for any function $f$ defined at $a$. For an interval $I = [a,b]$ we also use the notation
$\int_I f = \int_a^b f$.

Note that in the latter notation, we always assume that $\int_I f$ is an integral where the
lower limit (the subscript) on the integral is the least element of $I$ and the upper integral
bound or limit (the superscript) is the greatest element of $I$. Thus, if $b < a$ and $I$ is the set
of points between $a$ and $b$ or equal to $a$ or $b$ then $\int_I f = \int_b^a f$.
I b 


Theorem 6.13. (a) Let $f : [a,b] \to \mathbb{R}$ be integrable, and let $c \in (a,b)$. Then
$\int_a^c f(x)\,dx + \int_c^b f(x)\,dx = \int_a^b f(x)\,dx$.
(b) If $f$ is integrable on an interval containing $a$, $b$ and $c$ then
$\int_a^c f(x)\,dx + \int_c^b f(x)\,dx = \int_a^b f(x)\,dx$.


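Proof. (a) Let $\epsilon > 0$. Then we can find a partition $P$ of $[a,b]$ so that $U(f,P) - L(f,P) < \epsilon$,
and setting $Q = P \cup \{c\}$ we have $U(f,Q) - L(f,Q) < \epsilon$. If we define $P_1 = Q \cap [a,c]$ and $P_2 =
Q \cap [c,b]$ then note that $U(f,Q) - L(f,Q) = (U(f,P_1) - L(f,P_1)) + (U(f,P_2) - L(f,P_2)) < \epsilon$,
so $U(f,P_1) - L(f,P_1) < \epsilon$ and $U(f,P_2) - L(f,P_2) < \epsilon$. Thus, $\int_a^c f(x)\,dx$ and $\int_c^b f(x)\,dx$ exist.
Since $\int_a^c f + \int_c^b f$ and $\int_a^b f$ both lie in $[L(f,Q), U(f,Q)]$, it follows that $\left|\int_a^c f + \int_c^b f - \int_a^b f\right| < \epsilon$.
Since this is true for all $\epsilon > 0$ we conclude that $\int_a^c f + \int_c^b f = \int_a^b f$.

(b) If $a < c < b$ then the result is part (a). If $a < b < c$ then by part (a) $\int_a^b f + \int_b^c f = \int_a^c f$,
so $\int_a^c f + \int_c^b f = \int_a^c f - \int_b^c f = \int_a^b f$ by definition. Similarly, if $c < a < b$ then
$\int_c^a f + \int_a^b f = \int_c^b f$, so $\int_a^c f + \int_c^b f = -\int_c^a f + \int_c^b f = \int_a^b f$.
In the cases where $b < a$, the preceding cases (with the roles of $a$ and $b$ exchanged) give
$\int_b^c f + \int_c^a f = \int_b^a f$, so $\int_a^c f + \int_c^b f = -\left(\int_c^a f + \int_b^c f\right) = -\int_b^a f = \int_a^b f$.
If $a = c$ then $\int_a^c f + \int_c^b f = 0 + \int_a^b f = \int_a^b f$, and the cases $b = c$ and $a = b$ follow
from the definitions in the same way.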

Theorem 6.14. Let $f : [a,b] \to \mathbb{R}$ be integrable. Let $c \in [a,b]$ and let $F(x) = \int_c^x f(t)\,dt$.
Then $F$ is (uniformly) continuous.
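Proof. Choose $M > 0$ so that $|f(t)| \le M$ for all $t \in [a,b]$. Let $x \in [a,b]$ and let $\epsilon > 0$. Choose
$\delta = \frac{\epsilon}{M}$. If $|x - y| < \delta$ and $x, y \in [a,b]$ with $x < y$ then $|F(y) - F(x)| = \left|\int_x^y f\right|$ by Theorem
6.13. However, $-M \le f(t) \le M$ for all $t \in [a,b]$, so $\int_x^y -M \le \int_x^y f \le \int_x^y M$ by Exercise
6.4, which means that $-M\delta < F(y) - F(x) < M\delta$. Hence, $|F(x) - F(y)| < M\delta = \epsilon$.
Since $\delta$ does not depend on $x$, and the same bound holds with the roles of $x$ and $y$
exchanged, $F$ is uniformly continuous.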


Theorem 6.15. Fundamental Theorem of Calculus (first form). Let $f : [a,b] \to \mathbb{R}$ be
continuous, let $c \in [a,b]$ and let $F(x) = \int_c^x f(t)\,dt$ for all $x \in [a,b]$. Then $F'(x) = f(x)$ if
$a < x < b$. The function $F$ is also continuous at $a$ and $b$.


Proof. First, continuity follows from Theorem 6.14. If $a = b$ then the result is vacuously
true. Assume $a < b$ and let $x \in (a,b)$. If $h$ has small enough absolute value then
$x + h \in (a,b)$, and $\frac{F(x+h) - F(x)}{h} = \frac{\int_x^{x+h} f(t)\,dt}{h}$ by Theorem 6.13. By the First Mean
Value Theorem for Integrals, there is some $c_h$ between $x$ and $x + h$ so that
$\int_x^{x+h} f(t)\,dt = f(c_h)\int_x^{x+h} 1\,dt = hf(c_h)$. Hence,
$$\lim_{h \to 0} \frac{F(x+h) - F(x)}{h} = \lim_{h \to 0} \frac{\int_x^{x+h} f(t)\,dt}{h} = \lim_{h \to 0} \frac{hf(c_h)}{h} = \lim_{h \to 0} f(c_h) = f(x)$$
since $f$ is continuous and $\lim_{h \to 0} c_h = x$ by the Squeeze Theorem. Hence,
$F'(x) = f(x)$.
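
A numerical check of the first form can make the statement concrete. The sketch below is illustrative only: it assumes $f = \cos$ and $c = 0$, approximates $F$ by a midpoint Riemann sum (the helper `integral` and the step sizes are assumptions, not part of the text), and compares a difference quotient of $F$ with $f(x)$.

```python
import math


def integral(f, a, b, n=2000):
    """Midpoint Riemann sum of f over [a, b] with n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def F(x, c=0.0):
    """Approximation of the accumulation function F(x), the integral of cos from c to x."""
    return integral(math.cos, c, x)


x, h = 1.0, 1e-4
difference_quotient = (F(x + h) - F(x)) / h
print(difference_quotient, math.cos(x))
```

The two printed numbers should agree to several decimal places; the small discrepancy comes from using a nonzero $h$ and a finite Riemann sum rather than the limits in the theorem.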




Theorem 6.16. Fundamental Theorem of Calculus (second form). Let $F$ be a function so
that $F'(x) = f(x)$ for all $x \in [a,b]$, where $f(x)$ is integrable on $[a,b]$. Then $\int_a^b f(x)\,dx =
F(b) - F(a)$.

Proof. Let $\epsilon > 0$ and choose a partition $P = \{x_0, x_1, \ldots, x_n\}$ of $[a,b]$ so that $U(f,P) -
L(f,P) < \epsilon$. We know that $F$ is differentiable (and thus continuous) on each interval
$[x_{i-1}, x_i]$, so by the Mean Value Theorem, for each positive integer $i \le n$ we can pick
$x_i^* \in (x_{i-1}, x_i)$ so that $F'(x_i^*)(x_i - x_{i-1}) = F(x_i) - F(x_{i-1}) = f(x_i^*)(x_i - x_{i-1})$. Thus,
with marking $T = \{x_1^*, \ldots, x_n^*\}$ we have
$$S_T(f,P) = \sum_{i=1}^{n} f(x_i^*)(x_i - x_{i-1}) = \sum_{i=1}^{n} (F(x_i) - F(x_{i-1})) = F(b) - F(a).$$
Since $\int_a^b f$ and $F(b) - F(a)$ both lie in $[L(f,P), U(f,P)]$, it must follow that
$\left|\int_a^b f - (F(b) - F(a))\right| \le U(f,P) - L(f,P) < \epsilon$. Since this is true for all $\epsilon > 0$ it follows
that $\int_a^b f = F(b) - F(a)$.


The following is essentially a slight generalization of the preceding theorem. The function 
F need not actually be differentiable at a and 6 as long as it is continuous at those points, 
and we could have b < a or even b = a and the theorem above would still be true. The 
proof is essentially the same as that of the Fundamental Theorem of Calculus second form, 
with minor adjustments. 


Theorem 6.17. Let $F$ be a function so that $F'(x) = f(x)$ for all $x$ between real numbers $a$
and $b$, so that $F$ is continuous at $a$ and $b$, where $f(x)$ is integrable on the interval consisting
of points $a$ and $b$ and all points in between those points. Then $\int_a^b f(x)\,dx = F(b) - F(a)$.


Proof. First, assume $a < b$. Let $\epsilon > 0$ and choose a partition $P = \{x_0, x_1, \ldots, x_n\}$ of $[a,b]$
so that $U(f,P) - L(f,P) < \epsilon$. We know that $F$ is differentiable on each interval $(x_{i-1}, x_i)$
and continuous on each interval $[x_{i-1}, x_i]$, so by the Mean Value Theorem, for each positive
integer $i \le n$ we can pick $x_i^* \in (x_{i-1}, x_i)$ so that $F'(x_i^*)(x_i - x_{i-1}) = F(x_i) - F(x_{i-1}) =
f(x_i^*)(x_i - x_{i-1})$. Thus, with marking $T = \{x_1^*, \ldots, x_n^*\}$ we have
$$S_T(f,P) = \sum_{i=1}^{n} f(x_i^*)(x_i - x_{i-1}) = \sum_{i=1}^{n} (F(x_i) - F(x_{i-1})) = F(b) - F(a).$$
Since $\int_a^b f$ and $F(b) - F(a)$ both lie in $[L(f,P), U(f,P)]$,
it must follow that $\left|\int_a^b f - (F(b) - F(a))\right| \le U(f,P) - L(f,P) < \epsilon$. Since this is true for
all $\epsilon > 0$ it follows that $\int_a^b f = F(b) - F(a)$.

If $b < a$ then we know that $\int_b^a f = F(a) - F(b)$, so by definition $\int_a^b f = -\int_b^a f =
F(b) - F(a)$. Likewise, if $a = b$ then $\int_a^b f = 0 = F(a) - F(a)$, so the result is true for all
potential orderings of $a$ and $b$.


We will not refer to the preceding theorem when we use the second form of the Fundamental 
Theorem of Calculus, but will just refer to the Fundamental Theorem of Calculus itself. Note 
that in the statement of the second form of the Fundamental Theorem of Calculus, f is 
listed as being integrable, but we also often see this theorem stated with f being continuous 
rather than just integrable (and we might see $F$ simply as being differentiable on all of 
$[a, b]$ instead of just on the interior). If we refer to $f$ as being continuous, then the second 
form of the Fundamental Theorem of Calculus is a direct consequence of the first form of 
the Fundamental Theorem of Calculus, which helps us understand why both theorems are 
called the Fundamental Theorem of Calculus. Here is how an argument would go for that 
version of the theorem. 


Theorem 6.18. Fundamental Theorem of Calculus (second form, continuous case). If
$F'(x) = f(x)$ for all $x \in [a,b]$, where $a < b$, and $f(x)$ is continuous on $[a,b]$, then
$\int_a^b f(x)\,dx = F(b) - F(a)$.


Proof. We know $f$ is integrable by Theorem 6.8. By the Fundamental Theorem of Calculus,
first form, if we set $G(x) = \int_a^x f(t)\,dt$ then $G'(x) = f(x)$ for all $x \in (a,b)$ and we also know
that $G$ is continuous at $a$ and $b$ by Theorem 6.14. It is also true that $G(b) - G(a) =
\int_a^b f(x)\,dx$. Since $F'(x) = G'(x)$ on $(a,b)$, and $F$ and $G$ are both continuous at $a$ and $b$,
we know that $G(x) = F(x) + k$ for some constant $k$ by Theorem 5.20. Thus, $\int_a^b f(x)\,dx =
G(b) - G(a) = (F(b) + k) - (F(a) + k) = F(b) - F(a)$.


The following is the typical form of u-substitution that is used in most calculus classes. 


Theorem 6.19. Basic u substitution. Let $f$ and $g$ be continuously differentiable functions
so that $[a,b] \subseteq \mathrm{dom}(f \circ g)$. Then $\int_a^b f'(g(x))g'(x)\,dx = \int_{g(a)}^{g(b)} f'(u)\,du = f(g(b)) - f(g(a))$.


Proof. Since $f, g$ are continuously differentiable, we know that $f'(g(x))g'(x)$ is continuous
and thus integrable. Since $g$ is continuous, by the Intermediate Value Theorem, we know
that all points between $g(a)$ and $g(b)$ are in $g([a,b])$ and hence in the domain of $f$. By
the second form of the Fundamental Theorem of Calculus we know that $\int_{g(a)}^{g(b)} f'(u)\,du =
f(g(b)) - f(g(a))$. Likewise, by the chain rule we know that $(f \circ g)'(x) = f'(g(x))g'(x)$,
so by the Fundamental Theorem of Calculus we know that
$$\int_a^b f'(g(x))g'(x)\,dx = \int_a^b (f \circ g)'(x)\,dx = f(g(b)) - f(g(a)) = \int_{g(a)}^{g(b)} f'(u)\,du.$$
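
As a sanity check on the substitution formula, the following sketch compares the three quantities in the theorem numerically for one hypothetical choice, $f = \sin$ (so $f' = \cos$) and $g(x) = x^2$ on $[0,2]$. The function names and the midpoint-rule helper are assumptions made for illustration.

```python
import math


def midpoint_integral(f, a, b, n=4000):
    """Midpoint Riemann sum of f over [a, b] with n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def g(x):
    return x * x          # inner function (illustrative choice)


def g_prime(x):
    return 2 * x


f_prime = math.cos        # derivative of the outer function f = sin

a, b = 0.0, 2.0
left = midpoint_integral(lambda x: f_prime(g(x)) * g_prime(x), a, b)  # integral in x
middle = midpoint_integral(f_prime, g(a), g(b))                       # integral in u
right = math.sin(g(b)) - math.sin(g(a))                               # f(g(b)) - f(g(a))
print(left, middle, right)
```

All three values should agree up to the error of the Riemann sums, which is the content of the theorem for this choice of $f$ and $g$.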


Theorem 6.20. Integration by Parts. If $f$ and $g$ are continuously differentiable on $[a,b]$
then $\int fg' = fg - \int gf'$ and $\int_a^b fg' = f(b)g(b) - f(a)g(a) - \int_a^b f'g$.


Proof. First, note that by the product rule $(fg)' = f'g + g'f$, so $\int (fg)' = \int (f'g + g'f)$, so
$\int fg' = fg - \int gf'$. Using the second form of the Fundamental Theorem of Calculus and
Theorem 6.9, we have $f(b)g(b) - f(a)g(a) = \int_a^b (f'g + g'f)$, so $\int_a^b fg' = f(b)g(b) - f(a)g(a) -
\int_a^b f'g$.


It is perhaps worth mentioning that keeping track of iterated uses of integration by parts 
is easier by way of using a table. You put the factor to be differentiated on the left, and 
the factor to be integrated on the right, and list iterated derivatives below the factor to 
be differentiated and iterated antiderivatives below the factor to be integrated. You then 
connect the left column entries with diagonal lines moving one row down to the right column 
entries representing multiplication, and put a plus or minus sign above the connecting line 
segments, starting with plus and alternating. Then when you reach a point in the table 
where you could multiply horizontally and get a product which is integrable, you make one 
final sign alteration on the last connecting (horizontal) line representing that product. You 
then write the sum of the indicated signs times the products indicated by the diagonal lines 
and add (or subtract depending on the sign that has been assigned) the final integral, which 
lets you leave the table and finish the problem with an integral. 


Example 6.2. Evaluate $\int x^3 e^{2x}\,dx$.


Solution. Using the usual table with $x^3$ to be differentiated until it becomes zero
in the left column and $e^{2x}$ to be integrated in the right column we end up integrating
$(0)(e^{2x})$ to just get the constant of integration at the end, so the indefinite integral is
$\int x^3 e^{2x}\,dx = \frac{1}{2}x^3e^{2x} - \frac{3}{4}x^2e^{2x} + \frac{3}{4}xe^{2x} - \frac{3}{8}e^{2x} + C$.
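
The table procedure for a polynomial times an exponential is mechanical enough to sketch in code. The version below is a minimal illustration restricted to integrands of the form $p(x)e^{kx}$ with $p$ a polynomial; the function names `differentiate` and `tabular_poly_times_exp` and the coefficient-list representation are assumptions made for this sketch.

```python
from fractions import Fraction


def differentiate(coeffs):
    """Derivative of a polynomial given as [c0, c1, c2, ...], where c_i multiplies x**i."""
    return [i * c for i, c in enumerate(coeffs)][1:]


def tabular_poly_times_exp(coeffs, k):
    """Coefficients of q(x), where an antiderivative of p(x) e^(kx) is q(x) e^(kx).

    Left column: p, p', p'', ... until it becomes zero.
    Right column: iterated antiderivatives of e^(kx), namely e^(kx) / k**(j + 1).
    Signs on the diagonal products alternate, starting with plus.
    """
    result = [Fraction(0)] * len(coeffs)
    row, sign, power = list(coeffs), 1, Fraction(1, k)
    while row:
        for i, c in enumerate(row):
            result[i] += sign * c * power
        row, sign, power = differentiate(row), -sign, power / k
    return result


# p(x) = x^3 and k = 2, as in Example 6.2.
q = tabular_poly_times_exp([0, 0, 0, 1], 2)
print(q)  # [Fraction(-3, 8), Fraction(3, 4), Fraction(-3, 4), Fraction(1, 2)]
```

Reading the list from the constant term upward gives $-\frac{3}{8} + \frac{3}{4}x - \frac{3}{4}x^2 + \frac{1}{2}x^3$, the polynomial factor multiplying $e^{2x}$ in the solution above.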




Taylor’s series are one of the most helpful ways to analyze a function. This is based on 
integration by parts and induction as described below. 


Theorem 6.21. Taylor’s Theorem (first form). Let $n$ be a non-negative integer, and let
$f^{(i)}(x)$ be continuous on the closed interval whose end points are $x$ and $a$ for natural numbers
$i \le n+1$. Then $f(x) = \sum_{i=0}^{n} \frac{f^{(i)}(a)(x-a)^i}{i!} + \frac{1}{n!}\int_a^x f^{(n+1)}(t)(x-t)^n\,dt$.


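Proof. We proceed by induction on $n$. For the $n = 0$ case, we note that $\int_a^x f'(t)\,dt = f(x) -
f(a)$ by the Fundamental Theorem of Calculus, so it follows that
$$f(x) = f(a) + \int_a^x f'(t)\,dt = \sum_{i=0}^{0} \frac{f^{(i)}(a)(x-a)^i}{i!} + \frac{1}{0!}\int_a^x f^{(1)}(t)(x-t)^0\,dt.$$
Assume the result is true for $n = k$, so
$$f(x) = \sum_{i=0}^{k} \frac{f^{(i)}(a)(x-a)^i}{i!} + \frac{1}{k!}\int_a^x f^{(k+1)}(t)(x-t)^k\,dt.$$
Then, using integration by parts,
$$\int_a^x f^{(k+1)}(t)(x-t)^k\,dt = \left[-\frac{f^{(k+1)}(t)(x-t)^{k+1}}{k+1}\right]_a^x + \frac{1}{k+1}\int_a^x f^{(k+2)}(t)(x-t)^{k+1}\,dt = \frac{f^{(k+1)}(a)(x-a)^{k+1}}{k+1} + \frac{1}{k+1}\int_a^x f^{(k+2)}(t)(x-t)^{k+1}\,dt,$$
so
$$f(x) = \sum_{i=0}^{k} \frac{f^{(i)}(a)(x-a)^i}{i!} + \frac{f^{(k+1)}(a)(x-a)^{k+1}}{(k+1)!} + \frac{1}{(k+1)!}\int_a^x f^{(k+2)}(t)(x-t)^{k+1}\,dt = \sum_{i=0}^{k+1} \frac{f^{(i)}(a)(x-a)^i}{i!} + \frac{1}{(k+1)!}\int_a^x f^{(k+2)}(t)(x-t)^{k+1}\,dt$$
as desired. Thus, the result follows for all non-negative integers $n$.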

Theorem 6.22. Taylor’s Theorem (second form, Lagrange’s error bound). Let $n$ be a non-negative
integer, and let $f^{(i)}(x)$ be continuous on the closed interval $I$ whose end points are $x$
and $a$ for natural numbers $i \le n+1$. Then $f(x) = \sum_{i=0}^{n} \frac{f^{(i)}(a)(x-a)^i}{i!} + \frac{f^{(n+1)}(c)(x-a)^{n+1}}{(n+1)!}$
for some point $c \in I$.


Proof. Using the first form of Taylor’s Theorem we know $f(x) = \sum_{i=0}^{n} \frac{f^{(i)}(a)(x-a)^i}{i!} +
\frac{1}{n!}\int_a^x f^{(n+1)}(t)(x-t)^n\,dt$. Using the First Mean Value Theorem for integrals, since $f^{(n+1)}$ is
continuous on $I$ we know that for some point $c \in I$ it follows that
$$\frac{1}{n!}\int_a^x f^{(n+1)}(t)(x-t)^n\,dt = \frac{1}{n!} f^{(n+1)}(c)\int_a^x (x-t)^n\,dt = \frac{f^{(n+1)}(c)(x-a)^{n+1}}{(n+1)!}.$$
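
The Lagrange form of the remainder gives a computable error bound, which the following sketch checks for one hypothetical case: $f(x) = e^x$ expanded about $a = 0$. Every derivative of $e^x$ is $e^x$, which is increasing, so $|f^{(n+1)}(c)| \le e^{\max(0,x)}$ for $c$ between $0$ and $x$. The function names are illustrative assumptions.

```python
import math


def taylor_exp(x, n):
    """Degree-n Taylor polynomial of exp about a = 0, evaluated at x."""
    return sum(x ** i / math.factorial(i) for i in range(n + 1))


def lagrange_bound_exp(x, n):
    """Upper bound for |f^(n+1)(c) x^(n+1) / (n+1)!| with f = exp and c between 0 and x."""
    return math.exp(max(0.0, x)) * abs(x) ** (n + 1) / math.factorial(n + 1)


x = 1.5
for n in (2, 5, 10):
    error = abs(math.exp(x) - taylor_exp(x, n))
    print(n, error, lagrange_bound_exp(x, n))
```

In each printed row the actual error is no larger than the bound, and both shrink rapidly as $n$ grows because of the factorial in the denominator.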


Definition 35 


We say that a function $f$ is $C^n$ on an open set $U$ if $f^{(i)}(x)$ is continuous for all
positive integers $i \le n$. We say that $f$ is $C^\infty$ on $U$ if $f^{(i)}(x)$ is continuous for all
positive integers $i$. If $U$ is the domain of $f$ then we simply refer to $f$ as $C^n$ or $C^\infty$
instead of $C^n$ on $U$ or $C^\infty$ on $U$.


Definition 36 


If $\{z_n\}$ is a sequence then the $n$th partial sum of this sequence is $s_n = \sum_{i=1}^{n} z_i$.
The sequence $\{s_n\}$ is the series consisting of the sequence of partial sums $s_n$, and is
also denoted as $\sum_{n=1}^{\infty} z_n$. We also use $\sum_{n=1}^{\infty} z_n$ to refer to the point to which this series
converges, depending on context.


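Theorem 6.23. Let $f : (c,d) \to \mathbb{R}$ be a $C^\infty$ function so that
$\left\{\frac{1}{n!}\int_a^x f^{(n+1)}(t)(x-t)^n\,dt\right\} \to 0$ for some $a$ and $x$ in $(c,d)$. Then $f(x) = \sum_{i=0}^{\infty} \frac{f^{(i)}(a)(x-a)^i}{i!}$. Furthermore,
if $k_n \ge |f^{(n+1)}(t)|$ for all $t$ between $a$ and $x$ and $\left\{\frac{k_n (x-a)^{n+1}}{(n+1)!}\right\} \to 0$ then $f(x) =
\sum_{i=0}^{\infty} \frac{f^{(i)}(a)(x-a)^i}{i!}$.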


Proof. By Theorem 6.21, we know that $\sum_{i=0}^{n} \frac{f^{(i)}(a)(x-a)^i}{i!} + \frac{1}{n!}\int_a^x f^{(n+1)}(t)(x-t)^n\,dt -
f(x) = 0$, which means that if $\left\{\frac{1}{n!}\int_a^x f^{(n+1)}(t)(x-t)^n\,dt\right\} \to 0$ then it follows that
$\left\{\sum_{i=0}^{n} \frac{f^{(i)}(a)(x-a)^i}{i!} - f(x)\right\} \to 0$ by Exercise 3.8, which means that $\left\{\sum_{i=0}^{n} \frac{f^{(i)}(a)(x-a)^i}{i!}\right\} \to
f(x)$.

By Taylor’s Theorem (second form), for some $c$ between $x$ and $a$ or equal to $x$ or $a$
we know that $\frac{f^{(n+1)}(c)(x-a)^{n+1}}{(n+1)!} = \frac{1}{n!}\int_a^x f^{(n+1)}(t)(x-t)^n\,dt$. Since $k_n \ge |f^{(n+1)}(t)|$ for
all $t$ between $a$ and $x$ (and thus at $a$ and $x$ as well since $f^{(n+1)}$ is continuous), it follows
that
$$-\frac{k_n |x-a|^{n+1}}{(n+1)!} \le \frac{f^{(n+1)}(c)(x-a)^{n+1}}{(n+1)!} \le \frac{k_n |x-a|^{n+1}}{(n+1)!}.$$
Since $\left\{\frac{k_n |x-a|^{n+1}}{(n+1)!}\right\} \to 0$, it follows that
$\frac{1}{n!}\int_a^x f^{(n+1)}(t)(x-t)^n\,dt \to 0$ by the Squeeze Theorem, so by the argument above, $f(x) =
\sum_{i=0}^{\infty} \frac{f^{(i)}(a)(x-a)^i}{i!}$.


One of the many things we can illustrate with Taylor’s series is the process called 
Newton’s Method. 


Newton’s method is a means of quickly approximating solutions to equations with a high 
level of accuracy. It is based on the notion that near a zero of a function, its tangent line 
will have an x-intercept close to the zero (since the curve and the tangent line should both 
be moving in approximately the same direction near a given point). Since it is easy to find 
the zero of the tangent line function, we can use this as our new starting point, take a 
tangent line at the point on the curve with x value equal to the zero of the tangent line, 
determine where the new tangent line to the curve intersects the $x$-axis, and repeat the 
process, becoming closer and closer to the zero. 


To derive the formula for Newton’s method, we begin with a guess at an initial approximation
$x_1$ to a zero of a function $f(x)$. We then take the tangent line to the curve $y = f(x)$ at
$(x_1, f(x_1))$. The slope of the tangent line is $f'(x_1)$, so the tangent line is $y - f(x_1) =
f'(x_1)(x - x_1)$. Setting $f'(x_1)(x - x_1) + f(x_1) = 0$ we get $x = x_1 - \frac{f(x_1)}{f'(x_1)}$. Repeating this
process, we motivate the following definition.


Definition 37 


Inductively, if $x_n$ is the $n$th Newton’s method approximation to a zero of a
differentiable function $f(x)$ containing $x_n$ in its domain, then we define the $n+1$st
approximation to be $x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}$.




Note that in this definition, we did not define the initial guess $x_1$. That is a guess
which can be made based on something like the Intermediate Value Theorem, just a stab
in the dark at where a zero might be, or by looking at the graph and trying to eyeball
around where a zero might be. For purposes of this process, we do not have an established
algorithm for the primary guess. Below is a picture showing how Newton’s method works
for the specific example of trying to estimate a zero of $f(x) = x^3 - 5$. This is not interesting
in terms of showing Newton’s method’s strength for estimating roots quickly since we know
that the only zero of the function is $5^{1/3}$. However, since we know the solution, this does
help us to compare the approximations to the true value.


Example 6.3. Starting with an initial approximation of $x_1 = 1$ to the zero of $f(x) = x^3 - 5$,
find $x_2$ and $x_3$ using Newton’s method and approximate the cube root of five. You may use
decimal approximations to your answers.


Solution. We begin with an initial guess, $x_1 = 1$. This estimate was picked solely on the
basis that we like the number 1 (pretty much a wild guess). Sure enough, 1 is not a zero
of the function. We take $f'(x) = 3x^2$. Plugging into the formula for Newton’s method we
get $x_2 = 1 - \frac{1^3 - 5}{3(1^2)} = \frac{7}{3}$. To get the next estimate we take
$x_3 = \frac{7}{3} - \frac{\frac{343}{27} - \frac{135}{27}}{3\left(\frac{7}{3}\right)^2} = \frac{821}{441}$. A decimal
approximation to this estimate is 1.86. This is still fairly far from the true value of the
cube root, which is about 1.709975947. However, if we do one more iteration we will get
about 1.72. So, by the fourth approximation we have become quite close to the true value of
the root. One more iteration gives us 1.710060. This is still off, of course, but it is only
off by about one ten thousandth, so it is pretty close. Doing one more iteration gives us
1.709975951.
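
The iteration in this example can be reproduced with a few lines of code. This is a minimal sketch: the helper name `newton_step` is an assumption, and exact rational arithmetic is used only so that the early approximations appear as the fractions computed above.

```python
from fractions import Fraction


def newton_step(f, f_prime, x):
    """One Newton's method update: x_{n+1} = x_n - f(x_n) / f'(x_n)."""
    return x - f(x) / f_prime(x)


def f(x):
    return x ** 3 - 5


def f_prime(x):
    return 3 * x ** 2


x = Fraction(1)                 # the initial guess x_1 = 1
approximations = [x]
for _ in range(5):              # computes x_2 through x_6
    x = newton_step(f, f_prime, x)
    approximations.append(x)

print(approximations[1], approximations[2])   # 7/3 821/441
print([float(v) for v in approximations])
```

The decimal values in the second printed line can be compared with the approximations quoted in the solution; the distance to the cube root of five roughly squares at each step once the approximations are close.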


Newton’s Method for $f(x) = x^3 - 5$


Notice that the decimals all agree until the eighth place past the decimal. Here is 
a picture for the first two steps of the estimate. The initial estimate was pretty far off, 
and the derivative was fairly small, so the second estimate wasn’t great, but by the third 
estimate the estimates were starting to be pretty close to the cube root of five. Newton’s 
method works better when the derivative is not close to zero at the approximation point, 
particularly if the associated function value at the approximation point is large. If the 
derivative actually is zero at an approximation then you can’t use Newton’s method at 
that point (you would be dividing by zero). There are also anomalies where functions can 
actually bounce back and forth between two estimates indefinitely using Newton’s method 
and never get you any closer to the actual zero. Also, if you pick a point that isn’t close 
to the zero you are trying to find, you may well get a sequence of approximations that
converges to a zero of the function which is a different zero from the zero you were hoping 
to approximate (assuming the function has multiple zeros). However, most of the time this 
method will estimate a zero quite accurately and quite quickly, making it much better than 
the other methods we have discussed up to this point for estimating zeroes of functions. If 
we pick an initial estimate near the zero we want, then Newton’s method will generally give 
approximations converging to that particular zero. What we want, more specifically, is for 
the second derivative to be reasonably small and the first derivative to be reasonably large, 
and to start out at an approximating value which is near the zero we hope to find, in which 
case we approach the zero rapidly. Here is a more specific error bound. 




Theorem 6.24. Error bound for Newton's method. Let $f, f', f''$ be continuous on an
interval $I$ containing $x_n$, $x_{n+1}$ and $r$, where $x_n$ and $x_{n+1}$ are the $n$th and $(n+1)$st approximations
using Newton's method for the zero $r$ of the function $f(x)$. If $L, M > 0$ and for all $x \in I$ it
is true that $|f'(x)| \geq L$ and $|f''(x)| \leq M$, then it follows that $|x_{n+1} - r| \leq \frac{M}{2L}|x_n - r|^2$. If
$\frac{M}{2L}|x_1 - r| < \frac{1}{2}$ then $|x_n - r| < \frac{2L}{M}\left(\frac{1}{2}\right)^{2^{n-1}}$.


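Proof. By Taylor's Theorem with the second (Lagrange's) remainder formula we know that
$f(x) = f(x_n) + f'(x_n)(x - x_n) + R$ on the interior of $I$, where $R = \frac{f''(c)(x - x_n)^2}{2!}$ for
some $c$ between $x$ and $x_n$. Hence $|R| \leq \frac{M}{2}|x - x_n|^2$. If $x = r$ then we get
$0 = f(r) = f(x_n) + f'(x_n)(r - x_n) + R$, so $r = x_n - \frac{f(x_n)}{f'(x_n)} - \frac{R}{f'(x_n)}$.
Since $x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}$, this gives us that
$|x_{n+1} - r| = \left|\frac{R}{f'(x_n)}\right| \leq \frac{M}{2L}|x_n - r|^2$.

In the case where $\frac{M}{2L}|x_1 - r| < \frac{1}{2}$, this means that
$|x_2 - r| \leq \frac{M}{2L}|x_1 - r|^2 \leq \left(\frac{1}{2}\right)(|x_1 - r|) < \frac{1}{2} \cdot \frac{2L}{M} \cdot \frac{1}{2} = \frac{2L}{M}\left(\frac{1}{2}\right)^{2}$.
Proceeding by induction, assume $|x_k - r| < \frac{2L}{M}\left(\frac{1}{2}\right)^{2^{k-1}}$. Then
we have $|x_{k+1} - r| \leq \frac{M}{2L}|x_k - r|^2 < \frac{M}{2L}\left(\frac{2L}{M}\right)^2\left(\frac{1}{2}\right)^{2^{k}} = \frac{2L}{M}\left(\frac{1}{2}\right)^{2^{k}}$. The result follows
by induction.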
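As a numerical illustration of the first inequality in Theorem 6.24, the sketch below returns to $f(x) = x^3 - 5$ from Example 6.3. The interval $[1.7, 7/3]$ and the resulting constants $L$ and $M$ are our own choices for illustration (on that interval $|f'(x)| = 3x^2 \geq 3(1.7)^2$ and $|f''(x)| = 6x \leq 14$); they are not taken from the text.

```python
def newton_step(x):
    """One Newton step for f(x) = x**3 - 5."""
    return x - (x**3 - 5) / (3 * x**2)

r = 5 ** (1 / 3)          # the zero being approximated
lo, hi = 1.7, 7 / 3       # assumed interval I containing r and the iterates used below
L = 3 * lo**2             # lower bound for |f'(x)| = 3x^2 on I
M = 6 * hi                # upper bound for |f''(x)| = 6x on I
K = M / (2 * L)

x = 7 / 3                 # this is x2 from Example 6.3
checks = []               # pairs (actual error of next estimate, bound from the theorem)
for _ in range(3):
    x_next = newton_step(x)
    checks.append((abs(x_next - r), K * (x - r) ** 2))
    x = x_next

for error, bound in checks:
    print(f"error {error:.3e} <= bound {bound:.3e}")
```

In each row the actual error of the next estimate is below the bound computed from the previous error, as the theorem predicts.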


The first form of the Fundamental Theorem of Calculus makes certain function derivatives 
more accessible than they would be without integration. 


Definition 38


Define $\ln(x) = \int_1^x \frac{1}{t}\,dt$ for all $x > 0$.


Note that $\ln(x)$ exists by Theorem 6.8.


Theorem 6.25. For all $x, y > 0$ and $r \in \mathbb{Q}$ the following are true:
(a) $(\ln(x))' = \frac{1}{x}$ for each $x > 0$.
(b) $\ln(x)$ is increasing and $\ln(1) = 0$.
(c) $\ln(xy) = \ln(x) + \ln(y)$
(d) $\ln\left(\frac{x}{y}\right) = \ln(x) - \ln(y)$
(e) $\ln(x^r) = r\ln(x)$ if $r \in \mathbb{Q}$
(f) $\ln(x)$ has no lower bound or upper bound. The range of $\ln(x)$ is $\mathbb{R}$.
(g) There is a unique number which we will refer to as $e$ so that $\ln(e) = 1$
(h) We define $\exp(x)$ to be the inverse of $\ln(x)$. Then $\exp(x)$ is increasing on $\mathbb{R}$ and
$(\exp(x))' = \exp(x)$.
(i) If $r = \frac{p}{q}$, where $p$ is an integer and $q$ is a natural number then $e^r = \exp(r)$.


Proof. (a) This follows from the Fundamental Theorem of Calculus (version 1).

(b) $\ln(x)$ is increasing by Theorem 5.18 since $(\ln(x))' = \frac{1}{x} > 0$, and $\ln(1) = \int_1^1 \frac{1}{t}\,dt = 0$
by definition.

(c) Note that (treating $y$ as constant), $(\ln(xy))' = \frac{y}{xy} = \frac{1}{x} = (\ln(x))'$. Thus, by
Theorem 5.20, $\ln(x) + k = \ln(xy)$ for some constant $k$. Setting $x = 1$ we have $k = \ln(y)$.

(d) By (c) we know that $0 = \ln(1) = \ln\left(y \cdot \frac{1}{y}\right) = \ln(y) + \ln\left(\frac{1}{y}\right)$, so $\ln\left(\frac{1}{y}\right) = -\ln(y)$. Hence,
$\ln\left(\frac{x}{y}\right) = \ln\left(x \cdot \frac{1}{y}\right) = \ln(x) + \ln\left(\frac{1}{y}\right) = \ln(x) - \ln(y)$.

(e) $r = \frac{p}{q}$ for some integers $p$ and $q$, with $q > 0$. Inductively, we note that $\ln(x^1) = \ln(x)$
and if $\ln(x^k) = k\ln(x)$ then $\ln(x^{k+1}) = \ln(x^k x) = \ln(x^k) + \ln(x) = k\ln(x) + \ln(x) =
(k + 1)\ln(x)$, so for any natural number $n$ it follows that $\ln(x^n) = n\ln(x)$. Likewise,
$\ln(x) = \ln\left((x^{1/n})^n\right) = n\ln(x^{1/n})$, so $\ln(x^{1/n}) = \frac{1}{n}\ln(x)$ for any natural number $n$. If $m$ is
a negative integer then $\ln(x^m) = \ln\left(\frac{1}{x^{-m}}\right) = -(-m\ln(x)) = m\ln(x)$. If $r = 0$ then
$\ln(x^0) = \ln(1) = 0 = 0\ln(x)$.
Combining these, we have that $\ln(x^{p/q}) = p\ln(x^{1/q}) = \frac{p}{q}\ln(x)$.

(f) By (b) we know $\ln(2) > \ln(1) = 0$. Since the natural numbers are not bounded above,
for any $M > 0$ we can find a natural number $k$ so that $k > \frac{M}{\ln(2)}$, so $k\ln(2) > M$, and thus
$\ln(2^k) > M$, which means $\ln(x)$ is not bounded above. Similarly, $\ln(2^{-k}) = -k\ln(2) < -M$,
so $\ln(x)$ is not bounded below.

For any $z \in \mathbb{R}$ there are, thus, $a, b > 0$ so that $\ln(a) < z < \ln(b)$ and hence, by the
Intermediate Value Theorem, there is some point $c$ between $a$ and $b$ so that $\ln(c) = z$, so
every real number is in the range of $\ln(x)$.

(g) By (f) we can find a point $e$ so that $\ln(e) = 1$. This is the only number whose natural
logarithm is one because $\ln(x)$ is increasing by (b) and therefore one to one.

(h) We know that $\exp(x)$ has a domain of all real numbers because $\mathbb{R}$ is the range of
$\ln(x)$ by (f). By the Inverse Function Theorem, $\exp(x)$ is increasing and differentiable.
Hence, setting $y = \exp(x)$, we know $\ln(y) = x$ where $y$ is differentiable, so by the chain rule
$\frac{y'}{y} = 1$, which means that $y = y'$ and $\exp(x) = (\exp(x))'$.

(i) Note that $\ln(e^r) = r\ln(e) = r$ by (e), and $\ln(\exp(r)) = r$ by definition of inverse.
Since $\ln(x)$ is one to one, it follows that $e^r = \exp(r)$.
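The integral definition of the natural logarithm can be explored numerically. The sketch below approximates $\int_1^x \frac{1}{t}\,dt$ with a midpoint rule; the function name `ln_by_integral` and the number of subintervals are our own choices, and the values are only approximations of the integral rather than exact evaluations.

```python
import math

def ln_by_integral(x, steps=20000):
    """Approximate the integral of 1/t from 1 to x using the midpoint rule."""
    if x <= 0:
        raise ValueError("ln is only defined for x > 0")
    width = (x - 1) / steps            # negative when x < 1, which makes the result negative
    total = 0.0
    for i in range(steps):
        midpoint = 1 + (i + 0.5) * width
        total += 1 / midpoint
    return total * width

# Property (c): ln(xy) = ln(x) + ln(y), checked approximately for x = 2, y = 3.
print(ln_by_integral(6), ln_by_integral(2) + ln_by_integral(3))

# Property (g): the number e satisfies ln(e) = 1 (math.e is the library's value of e).
print(ln_by_integral(math.e))
```

The two numbers on the first output line agree to many decimal places, which is what property (c) predicts.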


Definition 39 


Let $x > 0$ and let $a \in \mathbb{R} \setminus \mathbb{Q}$. Then we define $x^a = \exp(a\ln(x))$.




Theorem 6.26. (a) $e^x = \exp(x)$ for all $x \in \mathbb{R}$
(b) For all $x \in (0,\infty)$, $(x^r)' = rx^{r-1}$
(c) For each $r > 0$, $(r^x)' = \ln(r)r^x$


Proof. (a) By definition, $e^x = \exp(x\ln(e)) = \exp(x)$.

(b) $(x^r)' = (e^{r\ln(x)})' = \frac{r}{x}e^{r\ln(x)} = \frac{r}{x}x^r = rx^{r-1}$ by the chain rule (and Exercise 6.12).

(c) Using the chain rule, we know that $(r^x)' = (\exp(x\ln(r)))' = \ln(r)\exp(x\ln(r)) =
\ln(r)r^x$.
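The derivative formulas in Theorem 6.26 can be checked numerically with difference quotients. This is only a sanity check under our own choices: the helper names `power` and `central_difference`, the exponent $\sqrt{2}$ (stored as a floating-point approximation), the point $x_0 = 3$ and the step size are all assumptions made for the illustration.

```python
import math

def power(x, a):
    """x**a computed as exp(a * ln(x)), following Definition 39 (requires x > 0)."""
    return math.exp(a * math.log(x))

def central_difference(func, x, h=1e-6):
    """Symmetric difference quotient approximating func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

r = math.sqrt(2)      # floating-point stand-in for an irrational number
x0 = 3.0

d_power = central_difference(lambda x: power(x, r), x0)   # approximates (x^r)' at x0
d_exp = central_difference(lambda x: power(r, x), x0)     # approximates (r^x)' at x0

print(d_power, r * power(x0, r - 1))          # compare with r * x^(r-1)
print(d_exp, math.log(r) * power(r, x0))      # compare with ln(r) * r^x
```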


The Fundamental Theorem of Calculus shows us how an antiderivative can be used to 
evaluate an integral, but there are some differences between the idea of an antiderivative, 
the most general antiderivative, a definite integral and an indefinite integral. 


Definition 40 


We say that $g(x)$ is an antiderivative of $f(x)$ on $(a,b)$ if $g'(x) = f(x)$ for all $x$ in $(a,b)$.


Note that, by Theorem 5.20, if $g(x)$ and $h(x)$ are antiderivatives of a function $f(x)$ on
an interval $J$ then $g(x) = h(x) + c$ (for some constant $c$) on the interval $J$.


Definition 41 


We refer to a collection of functions $\mathcal{C}$ as the most general antiderivative of a
continuous function $f(x)$ if every antiderivative of $f$ is an element of $\mathcal{C}$. The notation
$\int f(x)\,dx = g(x) + C$ means that the set of all functions $g(x) + C$ so that $C \in \mathbb{R}$ is the
indefinite integral of $f(x)$. This means that, on every open interval $(a,b) \subset \mathrm{dom}(f)$,
the most general antiderivative of the function $f$ restricted to domain $(a,b)$ is the set
of all functions of the form $g(x) + C$ so that $C \in \mathbb{R}$.


Note that the indefinite integral is not necessarily the most general antiderivative of f, 
only the form of the most general antiderivative of f on each open interval on which f is 
continuous. 

For functions which are continuous on a domain consisting of a finite set of mutually 
exclusive open intervals, the most general antiderivative is obtained by taking an antiderivative 
over each such interval on which f is defined and adding different constants. This is 
illustrated in the following example: 




Example 6.4. Find the most general antiderivative of $f(x) = \frac{1}{x^2}$.

Solution: First, we note that the derivative of $-\frac{1}{x}$ is $\frac{1}{x^2}$. Since $f(x)$ is continuous on
$(0,\infty)$ and on $(-\infty, 0)$ we conclude that $F(x) = -\frac{1}{x} + C_1$ when $x > 0$ and $F(x) = -\frac{1}{x} + C_2$
when $x < 0$ is the most general antiderivative of $f(x)$ (where the constants need not be
equal). In other words, every antiderivative of $f(x)$ is of the form stated for $F(x)$.


This is a picture of the graph of an antiderivative of $\frac{1}{x^2}$. The function $-\frac{1}{x} + C$ is the
indefinite integral, so we could add different constants on each of the two components of
the domain of $\frac{1}{x^2}$. The particular antiderivative we will graph is $g(x) = -2 - \frac{1}{x}$ if $x < 0$
and $g(x) = 1 - \frac{1}{x}$ if $x > 0$. The blue dashed graphs represent the function $y = -\frac{1}{x}$ for
comparison. Notice that the slopes at every point are the same as the slope of the graph of
$y = -\frac{1}{x}$, so the derivative is the same as the derivative of $y = -\frac{1}{x}$.
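The claim that this piecewise function has slope $\frac{1}{x^2}$ at every point of its domain, even though a different constant is used on each side of zero, can be checked numerically. The sketch below uses the function $g$ from the example; the helper `central_difference`, the sample points and the step size are our own choices.

```python
def g(x):
    """The antiderivative of 1/x**2 from the example: a different constant on each side of 0."""
    if x < 0:
        return -2 - 1 / x
    if x > 0:
        return 1 - 1 / x
    raise ValueError("g is not defined at 0")

def central_difference(func, x, h=1e-6):
    """Symmetric difference quotient approximating func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

sample_points = [-3.0, -0.5, 0.25, 2.0]
slopes = [central_difference(g, x) for x in sample_points]

for x, slope in zip(sample_points, slopes):
    print(f"x = {x}: slope of g is about {slope:.6f}, 1/x^2 = {1 / x**2:.6f}")

# The difference g(x) - (-1/x) is constant on each component, but the constants differ.
print(g(1.0) + 1 / 1.0, g(-1.0) + 1 / (-1.0))
```

The last line prints the two constants, 1 on the positive side and -2 on the negative side, so no single constant $C$ describes $g$ on the whole domain.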


[Figure: Antiderivative of $f(x) = \frac{1}{x^2}$]


Using the sum rule for derivatives and working backwards we get that if $g_1, g_2, \ldots, g_n$ are
continuous functions with antiderivatives $f_1, f_2, \ldots, f_n$ on each open interval in the domain
of $G(x) = a_1g_1(x) + a_2g_2(x) + \ldots + a_ng_n(x)$, then $\int a_1g_1(x) + a_2g_2(x) + \ldots + a_ng_n(x)\,dx =
a_1f_1(x) + a_2f_2(x) + \ldots + a_nf_n(x) + C$. For example, $\int x^3 + 4x + 9\,dx = \frac{x^4}{4} + 2x^2 + 9x + C$.


Theorem 6.27. Let $f(x) = \frac{1}{x}$. The most general antiderivative of $\frac{1}{x}$ is the set of all
functions $F(x)$ of the form $F(x) = \ln|x| + C_1$ if $x > 0$ and $F(x) = \ln|x| + C_2$ if $x < 0$,
where $C_1, C_2 \in \mathbb{R}$.

The indefinite integral $\int \frac{1}{x}\,dx = \ln|x| + C$.


Proof. If $x > 0$ then $\ln(x)$ is defined and we know that $(\ln(x))' = \frac{1}{x}$, which means that
every antiderivative of $f(x)$ restricted to any open interval $(a,b) \subset (0,\infty)$ has the form
$\ln(x) + C$ for some constant $C$ by Theorem 5.20. If $x < 0$, however, then $\frac{1}{x}$ still exists,
and $\ln(-x)$ is defined. Using the chain rule we see that $(\ln(-x))' = \frac{1}{-x}(-1) = \frac{1}{x}$. As
before, for $(a,b) \subset (-\infty,0)$ each antiderivative of $f(x)$ restricted to $(a,b)$ has the form
$\ln(-x) + C = \ln|x| + C$. Thus, for $x < 0$, we have $\int \frac{1}{x}\,dx = \ln(-x) + C_2$. Since $-x = |x|$ if $x < 0$ we can
consolidate this notation by saying that the most general antiderivative of $\frac{1}{x}$ is $\ln|x| + C_1$
if $x > 0$ and $\ln|x| + C_2$ if $x < 0$.


Note that in the preceding theorem we demonstrated that $\int \frac{1}{x}\,dx = \ln|x| + C$ is a
correct formula for the indefinite integral (whether we consider the indefinite integral to be
the general antiderivative on an interval wherein $x > 0$ or an interval where $x < 0$).
The difference between "most general antiderivative" and "indefinite integral" is also not
always immediately clear. So, $\int \frac{1}{x}\,dx = \ln|x| + C$ means that if we took an open interval
$(a,b)$ contained in the domain of $f(x) = \frac{1}{x}$, which would have to be a subset of either
$(-\infty,0)$ or $(0,\infty)$ since the natural log of zero is undefined, then the set of all functions of
the form $\ln|x| + C$ for $C \in \mathbb{R}$ would be the most general antiderivative for $f$ if the domain
of $f$ were restricted to $(a,b)$. This does not mean that the most general antiderivative of
$f$ on its entire domain is the same set of functions, only that if we restricted $f$ to an open
interval contained in its domain then all antiderivatives of $f$ on that open interval would
be of that form. If $(a,b) \subset (0,\infty)$, for instance, then all antiderivatives of $f(x)$ on $(a,b)$
would have form $\ln(x) + C = \ln|x| + C$. If $(a,b) \subset (-\infty,0)$ then all antiderivatives of $f$
would have form $\ln(-x) + C = \ln|x| + C$ for some constant $C$. However, an antiderivative
of $f$ on its entire domain (not an open interval subset of it) could have form $\ln|x| + C_1$ if
$x > 0$ and $\ln|x| + C_2$ if $x < 0$, where $C_1$ and $C_2$ are different constants (so the indefinite
integral does not include all antiderivatives of the function in this case). When the domain
of $f$ is a connected interval then the indefinite integral and the most general antiderivative
of $f$ are the same thing. When the domain is a union of disconnected intervals then a
different constant could be added to the antiderivative on each component of the domain
and still result in an antiderivative for the function, which means that the most general
antiderivative of $f(x)$ and the indefinite integral are not the same thing in that case.




Exercises: 


Exercise 6.1. Let $f : [a,b] \to \mathbb{R}$ be integrable and let $F(x) = \int_a^x f(t)\,dt$. Then $F$ is
integrable.

Exercise 6.2. Let $f(x) = 0$ if $x \neq 0$ and $f(x) = 1$ if $x = 0$. Prove that $\int_{-1}^{1} f(x)\,dx = 0$.

Exercise 6.3. If $f : [a,b] \to \mathbb{R}$ is bounded and has only finitely many discontinuities then
$f$ is integrable.

Exercise 6.4. (a) Let $f, g$ be integrable on $[a,b]$ and let $f(x) \leq g(x)$ for all $x \in [a,b]$. Then
$\int_a^b f(x)\,dx \leq \int_a^b g(x)\,dx$.
(b) Let $f$ be integrable on $[a,b]$. If $m \leq f(x) \leq M$ for all $x \in [a,b]$ then
$m(b - a) = \int_a^b m\,dx \leq \int_a^b f(x)\,dx \leq \int_a^b M\,dx = M(b - a)$.

Exercise 6.5. Let $f : [a,b] \to \mathbb{R}$ be continuous and non-negative. Prove that if $f(c) > 0$
for some $c \in [a,b]$ then $\int_a^b f(x)\,dx > 0$.

Exercise 6.6. Let $f : [a,b] \to \mathbb{R}$ be monotone. Then $f$ is integrable.

Exercise 6.7. Let $g_1(x), g_2(x)$ be differentiable on $[a,b]$, let $f(x)$ be continuous on $[a,b]$ and
let $F(x) = \int_{g_1(x)}^{g_2(x)} f(t)\,dt$. Then $F'(x) = f(g_2(x))g_2'(x) - f(g_1(x))g_1'(x)$.

Exercise 6.8. Show that $\lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n = e$.

Exercise 6.9. Let $f, g : [a,b] \to \mathbb{R}$ be integrable functions. Then $fg$ is bounded.

Exercise 6.10. Define $\frac{e^x + e^{-x}}{2} = \cosh(x)$ and $\frac{e^x - e^{-x}}{2} = \sinh(x)$. Then $(\sinh(x))' =
\cosh(x)$ and $(\cosh(x))' = \sinh(x)$.




Exercise 6.11. Find, with proof, an example of a function $f(x)$ which is integrable on $[a,b]$
so that $F(x) = \int_a^x f(t)\,dt$ is not differentiable.

Exercise 6.12. Prove that if $c > 0$ and $a, b \in \mathbb{R}$ then:
(a) $c^a c^b = c^{a+b}$.
(b) $\frac{c^a}{c^b} = c^{a-b}$.
(c) $(c^a)^b = c^{ab}$.

Exercise 6.13. Prove that if $\{x_n\}$ is a sequence of positive numbers converging to a real
number $r$ and $c > 0$ then $\{c^{x_n}\} \to c^r$.


Note that, as a result of this exercise, if we were to define exponents at irrational numbers 
as the limits of exponents at the first n digits of the decimal expansions of those numbers, 
then the definition of raising a number to an irrational number would be equivalent to the 
definition given above. 
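The remark above can be illustrated numerically: raising a base to longer and longer decimal truncations of an irrational exponent approaches the value given by $\exp(r\ln(c))$. In this sketch the base $c = 3$, the exponent $\sqrt{2}$ (stored as a floating-point approximation) and the helper `truncate` are our own choices for illustration.

```python
import math

def truncate(value, digits):
    """Keep the first `digits` decimal digits of a positive number."""
    scale = 10 ** digits
    return math.floor(value * scale) / scale

c = 3.0
r = math.sqrt(2)                            # floating-point stand-in for an irrational exponent
target = math.exp(r * math.log(c))          # c^r as defined through exp and ln

errors = []
for digits in range(1, 9):
    x_n = truncate(r, digits)               # rational exponents 1.4, 1.41, 1.414, ...
    errors.append(abs(c ** x_n - target))
    print(f"c^{x_n} differs from c^r by about {errors[-1]:.2e}")
```

The printed differences shrink as more digits are kept, in line with Exercise 6.13.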


Exercise 6.14. Let $f : [a,b] \to \mathbb{R}$ be integrable. Then $|f|$ is integrable, and
$\left|\int_a^b f(x)\,dx\right| \leq \int_a^b |f(x)|\,dx$.




Hints: 


Hint to Exercise 6.1. Let $f : [a,b] \to \mathbb{R}$ be integrable and let $F(x) = \int_a^x f(t)\,dt$. Then $F$
is integrable.


Continuous functions are integrable. 


Hint to Exercise 6.2. Let $f(x) = 0$ if $x \neq 0$ and $f(x) = 1$ if $x = 0$. Prove that
$\int_{-1}^{1} f(x)\,dx = 0$.


For a given € > 0, find a partition so that the upper sum over the partition is € and the 
lower sum is zero. 


Hint to Exercise 6.3. If $f : [a,b] \to \mathbb{R}$ is bounded and has only finitely many discontinuities
then $f$ is integrable.


If the supplementary material was covered, use the Lebesgue Characterization of Riemann 
Integrability. Otherwise, find upper and lower sums within distance € of each other by 
picking a partition with short subinterval rectangles about each point of discontinuity. 


Hint to Exercise 6.4. (a) Let $f, g$ be integrable on $[a,b]$ and let $f(x) \leq g(x)$ for all
$x \in [a,b]$. Then $\int_a^b f(x)\,dx \leq \int_a^b g(x)\,dx$.
(b) Let $f$ be integrable on $[a,b]$. If $m \leq f(x) \leq M$ for all $x \in [a,b]$ then
$m(b - a) = \int_a^b m\,dx \leq \int_a^b f(x)\,dx \leq \int_a^b M\,dx = M(b - a)$.


Use the Riemann sum characterization of integral (Theorem 6.7). Take a sequence of 
partitions with mesh converging to zero, use any markings you wish and then compare the 
Riemann sums for f and g using those markings. 


Hint to Exercise 6.5. Let $f : [a,b] \to \mathbb{R}$ be continuous and non-negative. Prove that if
$f(c) > 0$ for some $c \in [a,b]$ then $\int_a^b f(x)\,dx > 0$.

Since f is continuous, it is possible to guarantee that f is larger than some positive 


value on some open interval centered at c. Find a partition with a subinterval containing c 
which is sufficiently small. 


Hint to Exercise 6.6. Let $f : [a,b] \to \mathbb{R}$ be monotone. Then $f$ is integrable.


Recall that the supremum on a subinterval induced by a partition is always the function 
value at the right end point for an increasing function, and the infimum the value of the 
function at the left end point. What form does the upper minus lower sum take? 




Hint to Exercise 6.7. Let $g_1(x), g_2(x)$ be differentiable on $[a,b]$, let $f(x)$ be continuous
on $[a,b]$ and let $F(x) = \int_{g_1(x)}^{g_2(x)} f(t)\,dt$. Then $F'(x) = f(g_2(x))g_2'(x) - f(g_1(x))g_1'(x)$.


Combine the Fundamental Theorem of Calculus (first form) with the chain rule. 


Hint to Exercise 6.8. Show that $\lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n = e$.


You could also proceed by taking the log and applying Theorem 4.11, or you could use 
the definition of In(a) and the definition of e directly. 


Hint to Exercise 6.9. Let $f, g : [a,b] \to \mathbb{R}$ be integrable functions. Then $fg$ is bounded.


Recall that integrable functions are bounded. 


Hint to Exercise 6.10. Define $\frac{e^x + e^{-x}}{2} = \cosh(x)$ and $\frac{e^x - e^{-x}}{2} = \sinh(x)$. Then
$(\sinh(x))' = \cosh(x)$ and $(\cosh(x))' = \sinh(x)$.


Use the chain rule. 


Hint to Exercise 6.11. Find, with proof, an example of a function $f(x)$ which is integrable
on $[a,b]$ so that $F(x) = \int_a^x f(t)\,dt$ is not differentiable.


The function f would have to be discontinuous. Look at functions with a jump discontinuity. 


Hint to Exercise 6.12. Prove that if $c > 0$ and $a, b \in \mathbb{R}$ then:
(a) $c^a c^b = c^{a+b}$.
(b) $\frac{c^a}{c^b} = c^{a-b}$.
(c) $(c^a)^b = c^{ab}$.


Take the logs of both sides, and recall that In(x) is one to one. 


Hint to Exercise 6.13. Prove that if $\{x_n\}$ is a sequence of positive numbers converging
to a real number $r$ and $c > 0$ then $\{c^{x_n}\} \to c^r$.


Start by taking the natural log of the sequence terms and use Theorem 4.11. 


Hint to Exercise 6.14. Let $f : [a,b] \to \mathbb{R}$ be integrable. Then $|f|$ is integrable, and
$\left|\int_a^b f(x)\,dx\right| \leq \int_a^b |f(x)|\,dx$.


Try using theorems 6.10, 1.16 and 6.4. 




Solutions: 


Solution to Exercise 6.1. Let $f : [a,b] \to \mathbb{R}$ be integrable and let $F(x) = \int_a^x f(t)\,dt$.
Then $F$ is integrable.


Proof. By Theorem 6.14, we know that F is continuous, which means that F' is integrable 
by Theorem 6.8. 


Solution to Exercise 6.2. Let $f(x) = 0$ if $x \neq 0$ and $f(x) = 1$ if $x = 0$. Prove that
$\int_{-1}^{1} f(x)\,dx = 0$.

Proof. Let $0 < \epsilon < 4$ and let $P = \{-1, -\frac{\epsilon}{4}, \frac{\epsilon}{4}, 1\}$. Then $L(f,P) = 0$ and $U(f,P) = \frac{\epsilon}{2}$, so
$U(f,P) - L(f,P) < \epsilon$ and $f$ is integrable. Since $L(f,Q) = 0$ for every partition $Q$ it must
be the case that $(L)\int_{-1}^{1} f = \int_{-1}^{1} f(x)\,dx = 0$.


Solution to Exercise 6.3. If $f : [a,b] \to \mathbb{R}$ is bounded and has only finitely many
discontinuities then $f$ is integrable.


Proof. Since finite sets have Lebesgue measure zero by 7.61, f is integrable by Theorem 
7.73. 


OR without the supplementary materials: 


Proof. Let $\epsilon > 0$. Let $a \leq x_1 < x_2 < \ldots < x_m \leq b$, where $D = \{x_1, x_2, x_3, \ldots, x_m\}$ is
the set of points at which $f$ is discontinuous. Since $f$ is bounded we can choose $M > 0$
so that $|f(x)| < M$ for all $x \in [a,b]$. Let $Q = \{q_0, q_1, q_2, \ldots, q_r\}$ be a partition of $[a,b]$
whose mesh is less than $\frac{\epsilon}{8Mm}$. Let $K = \bigcup_{\{i \in \{1,2,\ldots,r\} \,|\, [q_{i-1},q_i] \cap D = \emptyset\}} [q_{i-1}, q_i]$. Then $K$ is
contained in $[a,b]$ and is bounded, and $K$ is the union of finitely many closed intervals
so $K$ is closed. Since $f$ is continuous at every point of $K$, we know that $f$ is uniformly
continuous on $K$ by Theorem 4.18, so we can find a $\delta > 0$ so that if $|x - y| < \delta$ and
$x, y \in K$ then $|f(x) - f(y)| < \frac{\epsilon}{2(b-a)}$. Let $P = \{p_0, p_1, p_2, \ldots, p_t\}$ be a partition of
$[a,b]$ which is a refinement of $Q$ with mesh less than $\delta$. At most two subintervals induced
by $P$ can contain any discontinuity $x_i$, which means that no more than $2m$ subintervals
induced by $P$ intersect $D$. We have
$U(f,P) - L(f,P) = \sum_{\{i \,|\, [p_{i-1},p_i] \cap D = \emptyset\}} (M_i - m_i)(p_i - p_{i-1}) + \sum_{\{i \,|\, [p_{i-1},p_i] \cap D \neq \emptyset\}} (M_i - m_i)(p_i - p_{i-1})$,
where both sums are over indices $i \in \{1,2,\ldots,t\}$. Since
$\sum_{\{i \,|\, [p_{i-1},p_i] \cap D \neq \emptyset\}} (M_i - m_i)(p_i - p_{i-1}) < (2M)(2m)\left(\frac{\epsilon}{8Mm}\right) = \frac{\epsilon}{2}$, and
$\sum_{\{i \,|\, [p_{i-1},p_i] \cap D = \emptyset\}} (M_i - m_i)(p_i - p_{i-1}) \leq \frac{\epsilon}{2(b-a)}(b - a) = \frac{\epsilon}{2}$,
it follows that $U(f,P) - L(f,P) < \epsilon$, so $f$ is integrable.




Solution to Exercise 6.4. (a) Let $f, g$ be integrable on $[a,b]$ and let $f(x) \leq g(x)$ for all
$x \in [a,b]$. Then $\int_a^b f(x)\,dx \leq \int_a^b g(x)\,dx$.

(b) Let $f$ be integrable on $[a,b]$. If $m \leq f(x) \leq M$ for all $x \in [a,b]$ then
$m(b - a) = \int_a^b m\,dx \leq \int_a^b f(x)\,dx \leq \int_a^b M\,dx = M(b - a)$.

Proof. Let $\{P_n\}$ be a sequence of partitions of $[a,b]$ with $\{|P_n|\} \to 0$, and choose a marking
$T_n$ for each $P_n$. Then since $f(x) \leq g(x)$ it follows that $S_{T_n}(f, P_n) \leq S_{T_n}(g, P_n)$. Since $f, g$ are
integrable, we have proven that $\{S_{T_n}(f, P_n)\} \to \int_a^b f(x)\,dx$ and $\{S_{T_n}(g, P_n)\} \to \int_a^b g(x)\,dx$.
By the Comparison Theorem, $\int_a^b f(x)\,dx \leq \int_a^b g(x)\,dx$.

Let $k$ be any real number and $P = \{x_0, x_1, \ldots, x_m\}$ be a partition of $[a,b]$. Then, for the
constant function with value $k$, $L(k, P) = U(k, P) = \sum_{i=1}^{m} k(x_i - x_{i-1}) = k(b - a)$, which means that
$\int_a^b k\,dx = k(b - a)$. Hence, if $m \leq f(x) \leq M$ for all $x \in [a,b]$ then
$m(b - a) = \int_a^b m\,dx \leq \int_a^b f(x)\,dx \leq \int_a^b M\,dx = M(b - a)$.


Solution to Exercise 6.5. Let $f : [a,b] \to \mathbb{R}$ be continuous and non-negative. Prove that
if $f(c) > 0$ for some $c \in [a,b]$ then $\int_a^b f(x)\,dx > 0$.

Proof. Since $f$ is continuous, $f$ is integrable by Theorem 6.8, and we may choose $0 < \delta$ so
that if $|x - c| < \delta$ then $|f(x) - f(c)| < \frac{f(c)}{2}$, so $f(x) > \frac{f(c)}{2}$. Then let $P$ be a partition
of $[a,b]$ with $|P| < \delta$. There is some subinterval $[x_{j-1}, x_j]$ induced by $P$ which contains $c$.
Thus, $\int_a^b f \geq L(f, P) \geq \frac{f(c)}{2}(x_j - x_{j-1}) > 0$.


Solution to Exercise 6.6. Let $f : [a,b] \to \mathbb{R}$ be monotone. Then $f$ is integrable.

Proof. Let $\epsilon > 0$. First, assume that $f$ is non-decreasing. Choose a partition $P =
\{x_0, x_1, \ldots, x_n\}$ of $[a,b]$ so that $|P| < \frac{\epsilon}{f(b) - f(a) + 1}$. Since $f$ is non-decreasing, the
infimum and supremum of all $f(x)$ values on the subinterval $[x_{i-1}, x_i]$ are $f(x_{i-1})$ and
$f(x_i)$ respectively, for each $1 \leq i \leq n$. Hence
$U(f, P) - L(f, P) = \sum_{i=1}^{n} (M_i - m_i)(x_i - x_{i-1}) = \sum_{i=1}^{n} (f(x_i) - f(x_{i-1}))(x_i - x_{i-1})
\leq \frac{\epsilon}{f(b) - f(a) + 1}\sum_{i=1}^{n} (f(x_i) - f(x_{i-1})) = \frac{\epsilon}{f(b) - f(a) + 1}(f(b) - f(a)) < \epsilon$.
Thus, $f$ is integrable. The case where $f$ is non-increasing is handled similarly.
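The telescoping sum in this proof is easy to observe numerically. The sketch below computes $U(f,P) - L(f,P)$ for a non-decreasing function on a uniform partition; the function name `upper_minus_lower` and the sample function $x^3$ on $[0,2]$ are our own choices for illustration.

```python
def upper_minus_lower(f, a, b, n):
    """U(f,P) - L(f,P) for a non-decreasing f on the uniform partition of [a,b] with n pieces."""
    width = (b - a) / n
    total = 0.0
    for i in range(1, n + 1):
        left = a + (i - 1) * width
        right = a + i * width
        # For a non-decreasing function the supremum is at the right end point
        # and the infimum is at the left end point of each subinterval.
        total += (f(right) - f(left)) * width
    return total

cube = lambda x: x**3

# The sum telescopes to (f(b) - f(a)) * mesh, here 8 * (2 / n).
for n in (10, 100, 1000):
    print(n, upper_minus_lower(cube, 0, 2, n), 8 * 2 / n)
```

Choosing the mesh smaller than $\frac{\epsilon}{f(b) - f(a) + 1}$, as in the proof, forces this difference below $\epsilon$.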




Solution to Exercise 6.7. Let $g_1(x), g_2(x)$ be differentiable on $[a,b]$, let $f(x)$ be continuous
on $[a,b]$ and let $F(x) = \int_{g_1(x)}^{g_2(x)} f(t)\,dt$. Then $F'(x) = f(g_2(x))g_2'(x) - f(g_1(x))g_1'(x)$.

Proof. Since $F(x) = \int_{g_1(x)}^{g_2(x)} f(t)\,dt = \int_a^{g_2(x)} f(t)\,dt - \int_a^{g_1(x)} f(t)\,dt$, it follows from the
chain rule and the First Form of the Fundamental Theorem of Calculus that $F'(x) =
f(g_2(x))g_2'(x) - f(g_1(x))g_1'(x)$.


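Solution to Exercise 6.8. Show that $\lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n = e$.

Proof. We know that $\ln\left(1 + \frac{1}{n}\right)^n = n\ln\left(1 + \frac{1}{n}\right) = n\int_1^{1 + \frac{1}{n}} \frac{1}{t}\,dt$. Since $\frac{1}{t}$ is decreasing, the
largest value of $\frac{1}{t}$ on $\left[1, 1 + \frac{1}{n}\right]$ is 1 and the smallest value is $\frac{n}{n+1}$. Thus, by Exercise 6.4
we know that $\frac{1}{n} \cdot \frac{n}{n+1} \leq \int_1^{1 + \frac{1}{n}} \frac{1}{t}\,dt \leq \frac{1}{n}(1)$, which means that $\frac{n}{n+1} \leq n\ln\left(1 + \frac{1}{n}\right) \leq 1$.
Hence, by the Squeeze Theorem we know that $\ln\left(1 + \frac{1}{n}\right)^n \to 1$. Since $e^x$ is continuous, it
follows that $e^{\ln\left(1 + \frac{1}{n}\right)^n} = \left(1 + \frac{1}{n}\right)^n \to e^1 = e$.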
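The squeeze used in this proof can be observed numerically. The sketch below prints the three quantities $\frac{n}{n+1}$, $n\ln(1 + \frac{1}{n})$ and 1 for a few values of $n$ (our own choice of values), along with $(1 + \frac{1}{n})^n$. The helper name `squeeze_terms` is hypothetical, and the library function `math.log1p` is used only as a numerically careful way to evaluate $\ln(1 + t)$.

```python
import math

def squeeze_terms(n):
    """Return (n/(n+1), n*ln(1 + 1/n), 1), the three quantities in the squeeze argument."""
    return n / (n + 1), n * math.log1p(1 / n), 1.0

for n in (1, 10, 100, 10_000, 1_000_000):
    lower, middle, upper = squeeze_terms(n)
    print(n, lower, middle, upper, (1 + 1 / n) ** n)

print("e is about", math.e)
```

As $n$ grows, the lower bound and the middle term both approach 1, and the last column approaches $e$.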


Solution to Exercise 6.9. Let $f, g : [a,b] \to \mathbb{R}$ be integrable functions. Then $fg$ is
bounded.


Proof. Since $f, g$ are integrable, they are both bounded functions. Thus, we may choose
$M, N > 0$ so that $|f(x)| < M$ and $|g(x)| < N$. Then $|f(x)g(x)| = |f(x)||g(x)| < MN$.


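Solution to Exercise 6.10. Define $\frac{e^x + e^{-x}}{2} = \cosh(x)$ and $\frac{e^x - e^{-x}}{2} = \sinh(x)$. Then
$(\sinh(x))' = \cosh(x)$ and $(\cosh(x))' = \sinh(x)$.

Proof. We just use the linearity of the derivative and the chain rule to get
$\left(\frac{e^x - e^{-x}}{2}\right)' = \frac{e^x + e^{-x}}{2}$ and $\left(\frac{e^x + e^{-x}}{2}\right)' = \frac{e^x - e^{-x}}{2}$.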

Solution to Exercise 6.11. Find, with proof, an example of a function $f(x)$ which is
integrable on $[a,b]$ so that $F(x) = \int_a^x f(t)\,dt$ is not differentiable.

Proof. Let $f(x) = 0$ if $0 \leq x < 1$ and let $f(x) = 1$ if $1 \leq x \leq 2$ and let $F(x) = \int_0^x f(t)\,dt$. We
know $f$ is integrable because it has only one discontinuity, so $f$ is integrable by the Lebesgue
Characterization of Riemann Integrability. However,
$\lim_{x \to 1^-} \frac{F(1) - F(x)}{1 - x} = \lim_{x \to 1^-} \frac{\int_x^1 0\,dt}{1 - x} = 0$,
whereas
$\lim_{x \to 1^+} \frac{F(x) - F(1)}{x - 1} = \lim_{x \to 1^+} \frac{\int_1^x 1\,dt}{x - 1} = \lim_{x \to 1^+} \frac{x - 1}{x - 1} = 1$.
Thus, $F$ is not differentiable at $x = 1$.
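The two one-sided limits in this solution can be seen directly from difference quotients. In the sketch below, $F$ is written in closed form: for the step function above, $F(x) = 0$ on $[0,1]$ and $F(x) = x - 1$ on $[1,2]$. The closed form, the helper names and the step sizes are our own choices for illustration.

```python
def F(x):
    """Integral from 0 to x of the step function that is 0 on [0, 1) and 1 on [1, 2]."""
    if not 0 <= x <= 2:
        raise ValueError("F is only defined on [0, 2]")
    return max(0.0, x - 1.0)

def left_quotient(h):
    """Difference quotient of F at 1 using a point to the left of 1."""
    return (F(1) - F(1 - h)) / h

def right_quotient(h):
    """Difference quotient of F at 1 using a point to the right of 1."""
    return (F(1 + h) - F(1)) / h

steps = [2.0 ** -k for k in range(1, 11)]   # powers of two keep the float arithmetic exact
left_values = [left_quotient(h) for h in steps]
right_values = [right_quotient(h) for h in steps]

print(left_values)    # quotients from the left
print(right_values)   # quotients from the right
```

The left quotients are all 0 and the right quotients are all 1, so the two one-sided derivatives at $x = 1$ disagree.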


Solution to Exercise 6.12. Prove that if $c > 0$ and $a, b \in \mathbb{R}$ then:
(a) $c^a c^b = c^{a+b}$.
(b) $\frac{c^a}{c^b} = c^{a-b}$.
(c) $(c^a)^b = c^{ab}$.

Proof. (a) We have shown in Theorem 6.25 that $\ln(c^a c^b) = a\ln(c) + b\ln(c) = (a + b)\ln(c) =
\ln(c^{a+b})$. Since $\ln(x)$ is one to one, it follows that $c^a c^b = c^{a+b}$.

(b) We know from Theorem 6.25 that $\ln\left(\frac{c^a}{c^b}\right) = a\ln(c) - b\ln(c) = (a - b)\ln(c) = \ln(c^{a-b})$.
Since $\ln(x)$ is one to one, we see that $\frac{c^a}{c^b} = c^{a-b}$.

(c) We know from Theorem 6.25 that $\ln((c^a)^b) = b\ln(c^a) = ab\ln(c) = \ln(c^{ab})$. Since
$\ln(x)$ is one to one this implies that $(c^a)^b = c^{ab}$.


Solution to Exercise 6.13. Prove that if $\{x_n\}$ is a sequence of positive numbers converging
to a real number $r$ and $c > 0$ then $\{c^{x_n}\} \to c^r$.

Proof. The function $c^x$ is differentiable by Theorem 6.26 and therefore continuous, so by the
Sequential Characterization of Continuity, $\{c^{x_n}\} \to c^r$.


Solution to Exercise 6.14. Let $f : [a,b] \to \mathbb{R}$ be integrable. Then $|f|$ is integrable, and
$\left|\int_a^b f(x)\,dx\right| \leq \int_a^b |f(x)|\,dx$.

Proof. We know $|f|$ is integrable by Theorem 6.10 since $f$ is integrable and $|x|$ is continuous.
Since $-|f(x)| \leq f(x) \leq |f(x)|$, we know that $-\int_a^b |f(x)|\,dx \leq \int_a^b f(x)\,dx \leq \int_a^b |f(x)|\,dx$ by
Theorem 6.4, which means that $\left|\int_a^b f(x)\,dx\right| \leq \int_a^b |f(x)|\,dx$ by Theorem 1.16.


Chapter 7 


Supplementary Materials for One 
Variable 


7.1 The Natural Numbers 


Definition 42 


A set $S \subset \mathbb{R}$ is inductive if $1 \in S$ and for every $k \in S$ it is true that $k + 1 \in S$.


The set of natural numbers N is the intersection of all inductive sets. A set S is 
well-ordered if every non-empty subset of S has a least element. 


One example of an inductive set is R, so we know that inductive sets exist. 


Theorem 7.1. (a) $\mathbb{N}$ is an inductive set.
(b) 1 is the least element of $\mathbb{N}$. Furthermore, $S = \{1\} \cup [2,\infty)$ is inductive.
(c) If $n > 1$ and $n \in \mathbb{N}$ then $n - 1 \in \mathbb{N}$.


Proof. (a) Since 1 is an element of all inductive sets (and there are inductive sets that exist),
$1 \in \mathbb{N}$. If $k \in \mathbb{N}$ then for every inductive set $S$ it follows that $k \in S$ and thus $k + 1 \in S$.
Hence $k + 1 \in \mathbb{N}$ and so $\mathbb{N}$ is inductive.

(b) We know that $1 \in S = \{1\} \cup [2,\infty)$, and if $x < 2$ and $x \in S$ then $x = 1$ so
$x + 1 = 2 \in S$. If $x \geq 2$ then $x + 1 > 2$ so $x + 1 \in (2,\infty)$ so $x + 1 \in S$. Thus, $S$ is inductive,
so $\mathbb{N} \subset S$ and since 1 is the least element of $S$ it follows that $\mathbb{N}$ contains no points that
precede 1.

(c) Suppose there is a natural number $l > 1$ so that $l - 1 \notin \mathbb{N}$. Let $W = \mathbb{N} \setminus \{l\}$. Then
since $l > 1$ we know that $1 \in W$. Likewise, for any $x \in W$ we know that $x + 1 \in W$ since
$x + 1 \in \mathbb{N}$ and $x + 1 \neq l$ (because $l - 1 \notin \mathbb{N}$). Hence $W$ is inductive and does not contain $l$, which is impossible
since $\mathbb{N}$ only contains numbers which are elements of every inductive set. We conclude that
if $n > 1$ and $n \in \mathbb{N}$ then $n - 1 \in \mathbb{N}$.


Theorem 7.2. Principle of Mathematical Induction. For each $n \in \mathbb{N}$, let $P(n)$ be a
statement so that (a) $P(1)$ is true, and (b) whenever $P(k)$ is true for a natural number
$k$, it follows that $P(k + 1)$ is also true. Then it follows that $P(n)$ is true for all $n \in \mathbb{N}$.


Proof. Let $S = \{n \in \mathbb{N} \,|\, P(n) \text{ is true}\}$. Then by (a) we know that $1 \in S$ and by (b) we know
that if $k \in S$ then $k + 1 \in S$. Therefore $S$ is inductive, so $\mathbb{N} \subset S$, which means that $P(n)$
is true for every $n \in \mathbb{N}$.


Theorem 7.3. (a) If $n \in \mathbb{N}$ then there are no elements of $\mathbb{N}$ between $n$ and $n + 1$ or between
$n - 1$ and $n$.


(b) The natural numbers are well-ordered. 


Proof. (a) Let $P(n)$ be the statement that there are no natural numbers between $j$ and
$j + 1$ for all natural numbers $j \leq n$. Since $\{1\} \cup [2,\infty)$ is inductive by Theorem 7.1 we
know that this set contains $\mathbb{N}$ which means that there are no natural numbers between 1
and 2, so $P(1)$ is true. Assume that there are no natural numbers between $i$ and $i + 1$ for
all $i \leq k - 1$ for some $k \in \mathbb{N}$. Then let $S = \mathbb{N} \cap ([1,k] \cup [k + 1,\infty))$. If $x \in \mathbb{N} \cap [1,k - 1]$
then $x + 1 \leq k$ and $x + 1 \in \mathbb{N}$ since $\mathbb{N}$ is inductive, which means $x + 1 \in S$. Since there
are no natural numbers between $k - 1$ and $k$, for any $x \in \mathbb{N} \cap [1,k]$, we know that either
$x \in [1,k - 1]$ in which case $x + 1 \in S$, or $x = k$ in which case $x + 1 = k + 1 \in S$, so $x + 1 \in S$.
Also, if $x > k$ and $x \in S$ then $x + 1 \in \mathbb{N}$ (since $\mathbb{N}$ is inductive) and $x + 1 \in [k + 1,\infty)$, which means
$x + 1 \in S$. Thus, $S$ is inductive so $\mathbb{N} \subset S$, which means that there are no natural numbers
between $k$ and $k + 1$. By induction, it follows that for each $n \in \mathbb{N}$, there are no elements of
$\mathbb{N}$ between $n$ and $n + 1$.

We know there are no elements of $\mathbb{N}$ in $(0,1)$ (since 1 is the least natural number), and
if $m > 1$ and $m \in \mathbb{N}$ then $m - 1 \in \mathbb{N}$ by Theorem 7.1, so there are no elements of $\mathbb{N}$ in
$(m - 1, m)$. Hence, for each natural number $n$ it is true that there are no natural numbers
between $n - 1$ and $n$.

(b) For each natural number $n$, let $P(n)$ be the statement: If $S$ is a subset of the natural
numbers that intersects the set of all natural numbers less than or equal to $n$ then $S$ has
a least element. We note that $P(1)$ is true since if $S$ contains 1 then 1 is the least element
of $S$ since 1 is the least natural number. Next, assume $P(k)$ is true for $k \in \mathbb{N}$. Let $S$ be
a subset of $\mathbb{N}$ which intersects the set of natural numbers less than or equal to $k + 1$. If
$S$ contains a natural number less than or equal to $k$ then $S$ has a least element by the
induction hypothesis. If $S$ does not contain any natural numbers less than or equal to $k$
then by (a) we know that there are no natural numbers between $k$ and $k + 1$ which means
that the least element of $S$ is $k + 1$. The result follows for all $n \in \mathbb{N}$ by induction. Since
any non-empty subset $S$ of the natural numbers must be a subset of the natural numbers
that intersects the set of all natural numbers less than or equal to one of the elements of $S$,
it follows that every non-empty subset of $\mathbb{N}$ has a least element, so $\mathbb{N}$ is well-ordered.


Definition 43 


We will let $\{1, 2, 3, \ldots, n\}$ be notation to denote $\mathbb{N} \cap [1,n]$ for any natural number $n$.


The $n$-tuple $(x_1, x_2, x_3, \ldots, x_n)$ is an ordered set of $n$ real numbers which is a function
$f : \{1, 2, 3, \ldots, n\} \to \mathbb{R}$, where $f(i) = x_i$ for each $i \in \{1, 2, 3, \ldots, n\}$.


Theorem 7.4. Recursive Definition.

For each natural number $n$ we let $P_n(x_1, x_2, \ldots, x_n)$ be a statement about $n$ real numbers
listed in order and satisfying the following criteria:

(1) $P_1(x)$ is true for exactly one point $x = x_1 \in \mathbb{R}$.

(2) If $n$ is a natural number greater than one then for any $x_1, x_2, x_3, \ldots, x_{n-1}$ so that
$P_{n-1}(x_1, x_2, \ldots, x_{n-1})$ is true there is exactly one $x_n$ so that $P_n(x_1, x_2, \ldots, x_{n-1}, x_n)$ is true.

Then for each $n \in \mathbb{N}$ there is a unique function $f_n : \{1, 2, 3, \ldots, n\} \to \mathbb{R}$ so that, if we
define $x_i = f_n(i)$ for each $i \in \{1, 2, 3, \ldots, n\}$, then $P_n(x_1, \ldots, x_n)$ is true,
and $f_n(i) = f_m(i)$ for all natural numbers $i \leq m \leq n$. Furthermore, there is a unique
function $f : \mathbb{N} \to \mathbb{R}$ so that, with $x_i = f(i)$, $P_n(x_1, x_2, \ldots, x_n)$ is true for each $n \in \mathbb{N}$.


Proof. We proceed by induction to show that for each natural number $n$ there is a unique
$f_n : \{1, 2, \ldots, n\} \to \mathbb{R}$ so that $P_n(x_1, \ldots, x_n)$ is true and $f_n(i) = f_m(i)$ for all natural numbers
$i \leq m \leq n$. This is true if $n = 1$ by (1). Assume this is true if $n = k$. Then there is a unique
$f_k : \{1, 2, 3, \ldots, k\} \to \mathbb{R}$ so that $P_k(x_1, x_2, x_3, \ldots, x_k)$ is true and $f_k(i) = f_j(i)$ for all natural
numbers $i \leq j \leq k$. By (2) there is a unique $x_{k+1}$ so that $P_{k+1}(x_1, \ldots, x_k, x_{k+1})$ is true
by hypothesis, so there is only one choice for $f_{k+1}$ meeting the requirements that $f_{k+1}(i)$
must equal $f_k(i)$ for all natural numbers $i \leq k$ and $P_{k+1}(x_1, \ldots, x_{k+1})$ is true, namely
$f_{k+1}(k + 1) = x_{k+1}$ and $f_{k+1}(i) = f_k(i)$ for all natural numbers $i \leq k$. Thus, $f_{k+1}(i) = f_j(i)$
for all natural numbers $i \leq j \leq k + 1$ since $f_j(i) = f_k(i)$ for $i \leq j \leq k$.

If we define $f : \mathbb{N} \to \mathbb{R}$ by $f(n) = f_n(n)$ for each $n \in \mathbb{N}$ then by the argument above we
know that $P_n(x_1, \ldots, x_n)$ is true for all $n \in \mathbb{N}$. Since $(x_1, x_2, \ldots, x_n)$ is the unique $n$-tuple so
that $P_i(x_1, \ldots, x_i)$ is true for all $1 \leq i \leq n$, it follows that the choice of $f$ is unique.
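The idea of Theorem 7.4 corresponds closely to how a recursively defined sequence is computed in practice: a first value is fixed, and each later value is determined by the values before it. The sketch below is only an illustration; the function name `build_tuple` and the way the rule is passed in as a function of the earlier values are our own choices, not notation from the text.

```python
def build_tuple(first, next_value, n):
    """Return (x_1, ..., x_n), where x_1 = first and each later x_k is
    the unique value next_value assigns to the tuple (x_1, ..., x_{k-1})."""
    xs = [first]
    while len(xs) < n:
        xs.append(next_value(tuple(xs)))
    return tuple(xs)

def add_one(previous):
    """The rule x_n = x_{n-1} + 1."""
    return previous[-1] + 1

f_5 = build_tuple(1, add_one, 5)     # plays the role of f_5 in the theorem
f_10 = build_tuple(1, add_one, 10)   # plays the role of f_10

print(f_5)
print(f_10)
print(f_10[:5] == f_5)   # f_10 agrees with f_5 on {1, ..., 5}
```

The final comparison reflects the compatibility condition $f_n(i) = f_m(i)$ for $i \leq m \leq n$: the longer tuple extends the shorter one rather than changing it.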


The process of generating a sequence by recursive definition is the process described 
here. Generally, we start with one or more numbers at the beginning of such a sequence 
and then assign a rule to define the next number in the sequence based on its predecessors. 
Initially, when talking about N, we would have liked to simply say that N = {1,2,3,4,...} 
which was meaningless because it depended on an undefined notion of dots at the end. 
Now, we can instead say that we would like to define statement $P_1$ to be $x_1 = 1$, which
is true if and only if $x_1 = 1$. Then we can say that if $n \in \mathbb{N}$ and $n > 1$ then we define
$P_n$ to be the statement that $x_n = x_{n-1} + 1$. If we have already established the tuple
$(x_1, x_2, \ldots, x_{n-1})$ so that $P_i(x_1, \ldots, x_i)$ is true for all $i \leq n - 1$ then there is only one $x_n$
satisfying $P_n$. In other words, once we know $x_{n-1}$, the assignment $x_n = x_{n-1} + 1$ is unique.
Hence, we have established that the function $f : \mathbb{N} \to \mathbb{R}$ induced by these statements is
unique. We can inductively argue that $f(n) = n$ for each natural number $n$ under this
mapping. First, $f(1) = 1$, and if $f(k) = k$ then by definition $f(k + 1) = f(k) + 1 = k + 1$.
This means that the set defined by the rule “start at one, and then for each element of 
the set the element obtained by adding one to that element is the next element of the set 
and continue indefinitely” is the set of natural numbers. This is what is actually meant by 
N = {1,2,3,...} since the “...” is intended to mean “continue with the rule that the next 
element of the set is the preceding element plus one.” Thus, now that we have proven the 
preceding theorem, it is reasonable and correct for us to write N = {1,2,3,...} if the rule 
choosing the unique next element in the list is understood and is clearly unique based on its 
predecessors and a unique starting point is included, and the statement now has meaning. 
Recall, though, that the function f used to define the natural numbers in this way has
the natural numbers as its domain. In other words, we had to establish the properties of
the natural numbers before we could use this description of the natural numbers.
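
The recursive rule discussed above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the helper name `recursive_sequence` is hypothetical, and a program can only inspect finitely many terms, so it does not replace the inductive argument.

```python
from collections.abc import Callable, Iterator
from itertools import islice


def recursive_sequence(first: int, rule: Callable[[int], int]) -> Iterator[int]:
    """Yield x_1, x_2, ... where x_1 = first and x_n = rule(x_{n-1})."""
    x = first
    while True:
        yield x
        x = rule(x)  # the unique next term determined by its predecessor


# x_1 = 1 and x_n = x_{n-1} + 1 gives f(n) = n for every term we inspect.
first_ten = list(islice(recursive_sequence(1, lambda x: x + 1), 10))
print(first_ten)
```

Changing the starting value or the rule changes the sequence, but once both are fixed each term is determined by the ones before it.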

It should also be understood that the statement may vary depending on a choice for 
recursive definition, in which case the sequence is unique based on those choices, but the 
choices are usually unknown so which sequence is determined by those choices is not known 
(there would be different possibilities for different choices). For example, you could say pick
$x_1 \in \mathbb{R}$ to be the first member of a sequence, let $x_2$ be a number more than $x_1$ and then
choose a number $x_3 > x_2$ and so on. After the choice of $x_1$ is made (but not before) the
statement that "$x_1$ is the point that was chosen" is satisfied by only one real number. The
sequence chosen is an increasing sequence and exists. The specific property each choice 
satisfies based on the preceding statements is that the next choice is chosen from among 
those points that are greater than the preceding choice. There were many choices possible, 
and the choice was not unique (but the point chosen became unique once the choice was 
made, and was the unique point satisfying the statement that the point was the one which 
was selected). Nevertheless, making a choice among the available options at each stage for
the next element in the sequence is a legitimate way of creating a sequence (even though its
members are not specified if the choices were never stated).


7.2 Fundamental Theorem of Arithmetic 


Theorem 7.5. Euclid's Division Lemma. Let $a \in \mathbb{Z}$ and let $b \in \mathbb{N}$. Then there are unique
integers $q$ and $r$ so that $a = bq + r$ and $0 \le r < b$.


Proof. Let $S = \{a - nb \in \mathbb{N} \cup \{0\} \mid n \in \mathbb{Z}\}$. Then $S$ is non-empty since by Theorem 2.6,
we know that there is a natural number $m > \frac{-a}{b}$ so $a - (-m)b > 0$. Hence, since $\mathbb{N}$ is
well-ordered, $S \cap \mathbb{N}$ has a smallest element, which means that $S$ has a smallest element
(which is zero if $0 \in S$ and the smallest element of $S \cap \mathbb{N}$ otherwise). We write this element
as $a - qb = r$ for some $q \in \mathbb{Z}$. Since $a - qb$ is the least element of $S$ it must follow that
$a - qb < b$ or else $a - (q+1)b \ge b - b = 0$, so $a - (q+1)b \in S$, contradicting $a - qb$ being
the least element of $S$. Since $0 \le a - qb < b$ we know that $a = bq + r$ and $0 \le r < b$.


To see that the choice of $q$ and $r$ is unique, first note that for any integers $Q, R$ so that
$a = bQ + R$ it must follow that $R = a - bQ$, so $R$ is uniquely determined by $Q$. If $Q > q$ then
$R = a - bQ \le a - (q+1)b = r - b < 0$, and if $Q < q$ then $R = a - bQ \ge a - (q-1)b = r + b \ge b$.
Hence, there is only one possible choice of $q$ which makes $a = bq + r$ and $0 \le r < b$ for a
uniquely associated integer $r$.
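
As a quick numerical companion to the lemma, the following Python sketch computes the pair $(q, r)$ for given $a$ and $b$. The function name `quotient_remainder` is a hypothetical helper; it relies on the fact that Python's floor division `//` rounds toward negative infinity, which yields $0 \le r < b$ even when $a$ is negative.

```python
def quotient_remainder(a: int, b: int) -> tuple[int, int]:
    """Return (q, r) with a == b*q + r and 0 <= r < b, for an integer a and natural number b."""
    if b <= 0:
        raise ValueError("b must be a natural number")
    q = a // b      # floor division: q is the largest integer with b*q <= a
    r = a - b * q   # hence 0 <= r < b
    return q, r


for a in (17, -17, 0, 5):
    q, r = quotient_remainder(a, 5)
    print(a, "=", 5, "*", q, "+", r)
```

For example, $a = -17$ and $b = 5$ give $q = -4$ and $r = 3$, since $-17 = 5(-4) + 3$.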


Definition 44 


Let $a \in \mathbb{Z}$ and let $b \in \mathbb{N}$. Let $q$ and $r$ be the unique integers so that $a = bq + r$
and $0 \le r < b$. Then we say that $q$ is the quotient when $a$ is divided by $b$ and that $r$
is the remainder.


Theorem 7.6. Bezout's Identity. Let $x$ and $y$ be integers which are not both zero. Then the
smallest positive integer $d$ of the form $ix + jy$, where $i,j \in \mathbb{Z}$, is the greatest common divisor
of $x$ and $y$. Furthermore, the integers of the form $ix + jy$, where $i,j \in \mathbb{Z}$, are the integer
multiples of $d$.


Proof. Let $S = \{mx + ny \in \mathbb{N} \mid m,n \in \mathbb{Z}\}$. Then $S$ is non-empty (if $x \ne 0$ then either $x \in \mathbb{N}$ or
$-x \in \mathbb{N}$, and similarly for $y$ if $y \ne 0$) so since $\mathbb{N}$ is well-ordered we know that $S$ has a least
element $d = sx + ty$ for some integers $s,t$. Using Euclid's Division Lemma, we know that there
are unique integers $q,r$ so that $x = qd + r$ and $0 \le r < d$. This means that
$r = x - qd = x - q(sx + ty) = (1 - qs)x - qty \in S \cup \{0\}$. However, since $r < d$ and $d$ is the
smallest element of $S$ we know that $r = 0$. From this we conclude that $x = qd$. We can similarly
show that for some integer $w$ it is true that $y = wd$, which means that $d$ is a divisor of $x$ and
$y$. If $c$ is any other positive divisor of $x$ and $y$ then there are integers $c_x, c_y$ so that
$x = c_x c$ and $y = c_y c$. That means that $d = s c_x c + t c_y c = c(s c_x + t c_y)$, which means that
$s c_x + t c_y \ge 1$ (since multiplying a negative number or zero by a positive integer cannot result
in a positive integer). Thus $c(s c_x + t c_y) \ge c(1)$, so $d \ge c$, meaning that $d$ is the greatest
common divisor.

Finally, for any integer of the form $ix + jy$, we know that $ix + jy = iqd + jwd = d(iq + jw)$
is a multiple of $d$. Likewise, any multiple of $d$ by an integer $k$ is equal to $kd = k(sx + ty) =
(ks)x + (kt)y$, which is an integer of the form $ix + jy$.
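
The proof above is an existence argument based on well-ordering. For computing actual coefficients $s, t$, one common alternative is the extended Euclidean algorithm, which is not the method used in the text. The sketch below is a minimal version of it; the name `bezout` is a hypothetical helper.

```python
def bezout(x: int, y: int) -> tuple[int, int, int]:
    """Return (d, s, t) with d = gcd(x, y) > 0 and s*x + t*y == d (x, y not both zero)."""
    if x == 0 and y == 0:
        raise ValueError("x and y must not both be zero")
    # Invariants: old_r == old_s*x + old_t*y and r == s*x + t*y.
    old_r, r = x, y
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    if old_r < 0:  # normalise so that d is positive
        old_r, old_s, old_t = -old_r, -old_s, -old_t
    return old_r, old_s, old_t


d, s, t = bezout(240, 46)
print(d, s, t, s * 240 + t * 46)
```

The coefficients are not unique (adding a multiple of $y/d$ to $s$ and subtracting the same multiple of $x/d$ from $t$ gives another pair), so different algorithms may return different $s$ and $t$ for the same $d$.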


Theorem 7.7. Euclid's Lemma. Let $a,b$ be integers and let $p \in \mathbb{N}$ so that the greatest
common divisor of $p$ and $a$ is one and $p \mid ab$. Then $p \mid b$.


Proof. By Bezout's Identity, we know that we can find integers $s,t$ so that $sp + ta = 1$ which
means that $bsp + bta = b$. Since $p \mid ab$ we can find an integer $m$ so that $pm = ab$. Hence,
$psb + pmt = b$ so $p(sb + tm) = b$, which means that $p$ divides $b$.


Theorem 7.8. Corollary to Euclid's Lemma. Let $p$ be a prime number which divides $q^k$,
where $q$ is a prime number and $k$ is a positive integer. Then $p = q$. Furthermore, the greatest
common divisor of one prime and a different prime taken to a positive power is always one.


Proof. Suppose $p \ne q$. Then $p$ does not divide $q$ since $q$ is prime, so the greatest common
divisor of $p$ and $q$ is one. Hence, if $p \mid q^2$ then $p \mid q$ by Euclid's Lemma, which is impossible,
which means that the greatest common divisor of $p$ and $q^2$ is one. Proceeding inductively, if
we have shown that the greatest common divisor of $q^r$ and $p$ is one for some positive integer
$r$ then if $p \mid q^{r+1}$ then $p \mid q^r q$, so it follows from Euclid's Lemma that $p \mid q$, which it cannot.
Thus, $p$ does not divide $q^{r+1}$ and therefore the greatest common divisor of $p$ and $q^{r+1}$ is
one. Thus $p$ cannot divide any positive power of $q$ and has greatest common divisor of one
with every positive power of $q$.


Definition 45 


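Let $\{p_1,p_2,\dots,p_\ell\}$ be a set of prime numbers and $\{m_1,m_2,\dots,m_\ell\} \subset \mathbb{N}$. If
$n = p_1^{m_1} p_2^{m_2} p_3^{m_3} \cdots p_\ell^{m_\ell}$ then we refer to $p_1^{m_1} p_2^{m_2} p_3^{m_3} \cdots p_\ell^{m_\ell}$ as a prime factorization or prime
decomposition for $n$ (or of $n$).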


In this definition we said "a" prime factorization, but we are about to prove there is
only one prime decomposition for any positive integer greater than one.


Theorem 7.9. The Fundamental Theorem of Arithmetic. Let $n > 1$ where $n \in \mathbb{N}$. Then
there is a unique set of prime numbers $\{p_1,p_2,\dots,p_\ell\}$ with corresponding positive integer
powers $\{m_1,m_2,\dots,m_\ell\}$ so that $n = p_1^{m_1} p_2^{m_2} p_3^{m_3} \cdots p_\ell^{m_\ell}$.

Proof. We first show there is such a set of powers of prime numbers. We proceed by strong
induction on $n$. First, if $n = 2$ then $n$ can be represented as $n = 2^1$. Assume that we have
shown there is a representation of $n$ as a product of prime numbers for all $1 < n \le k$. If $k+1$
is prime then $k + 1 = (k+1)^1$. If $k+1$ is composite then $k + 1 = st$, where $s,t$ are positive
integers greater than one and thus also less than $k+1$ (since a number greater than one
times a number greater than or equal to $k+1$ is greater than $k+1$). By the induction
hypothesis, since $1 < s \le k$ and $1 < t \le k$ there is a representation of $s,t$ as a product of
prime numbers raised to positive integer powers and we can write $s = p_1^{m_1} p_2^{m_2} p_3^{m_3} \cdots p_\ell^{m_\ell}$,
$t = q_1^{j_1} q_2^{j_2} q_3^{j_3} \cdots q_c^{j_c}$, where each $p_i$ and $q_i$ is prime and each $m_i$ and $j_i$ is a positive integer.
Thus, $k + 1 = st = p_1^{m_1} p_2^{m_2} p_3^{m_3} \cdots p_\ell^{m_\ell} q_1^{j_1} q_2^{j_2} q_3^{j_3} \cdots q_c^{j_c}$ is a product of primes raised to positive
powers.

To show uniqueness, suppose $S = \{n \in \mathbb{N} \mid n \text{ has two different prime factorizations}\}$ is
non-empty. Then since $\mathbb{N}$ is well ordered, there is a least integer $n \in S$. Then there are
factorizations $n = p_1^{m_1} p_2^{m_2} p_3^{m_3} \cdots p_\ell^{m_\ell}$ and $n = q_1^{j_1} q_2^{j_2} q_3^{j_3} \cdots q_c^{j_c}$ for $n$ (where each $p_i$ and $q_i$ is
prime and each $m_i$ and $j_i$ is a positive integer). Since $p_1 \mid n$, by Euclid's Lemma we know
that $p_1$ divides some $q_k^{j_k}$ and therefore $p_1 = q_k$ for some positive integer $k$ by the Corollary
to Euclid's Lemma (since $p_1, q_k$ are prime). Thus, $\frac{n}{p_1}$ is an integer less than $n$ which can
be represented with two different prime factorizations, namely
$\frac{n}{p_1} = p_1^{m_1 - 1} p_2^{m_2} \cdots p_\ell^{m_\ell} = q_1^{j_1} q_2^{j_2} \cdots q_k^{j_k - 1} \cdots q_c^{j_c}$.
This contradicts the assumption that $n$ was the smallest element of
$S$. We conclude that there is exactly one prime factorization of any natural number greater
than one.
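
To make the existence half of the theorem concrete, here is a small Python sketch that finds a prime factorization by trial division. Trial division is an assumption of this example rather than the method of the proof (which uses strong induction), and the name `prime_factorization` is hypothetical.

```python
def prime_factorization(n: int) -> dict[int, int]:
    """Return {p: m} with n equal to the product of p**m over the prime keys p (n > 1)."""
    if n <= 1:
        raise ValueError("n must be a natural number greater than one")
    factors: dict[int, int] = {}
    p = 2
    while p * p <= n:
        while n % p == 0:          # smaller candidates were divided out, so p is prime here
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                      # what remains has no divisor up to its square root, so it is prime
        factors[n] = factors.get(n, 0) + 1
    return factors


print(prime_factorization(360))   # 360 = 2**3 * 3**2 * 5
```

By the uniqueness half of the theorem, any other correct factoring procedure must return the same primes with the same powers.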


Theorem 7.10. Let $\frac{m}{n} \in \mathbb{Q}$, where $m \in \mathbb{Z}$ and $n \in \mathbb{N}$. Then $\frac{m}{n} = \frac{p}{q}$ for some $p \in \mathbb{Z}$ and
$q \in \mathbb{N}$, where the greatest common divisor of $p$ and $q$ is one.

Proof. Since $\mathbb{N}$ is well ordered, there is a smallest natural number $q$ so that $\frac{m}{n} = \frac{p}{q}$ for
some integer $p$. Given $q$, $p$ is uniquely determined, since $p = \frac{qm}{n}$. If $k$ is an integer greater
than one which is a common divisor of $q$ and $p$ then we can write $\frac{p}{q}$ as $\frac{sk}{tk}$ for integers $s, t$,
which means that $t < q$ and it is possible to write $\frac{m}{n} = \frac{s}{t}$, contradicting our choice of $q$.
Hence, the greatest common divisor of $p$ and $q$ is one.


Definition 46 


Let $\frac{p}{q} \in \mathbb{Q}$ for some $p \in \mathbb{Z}$ and $q \in \mathbb{N}$. We say $\frac{p}{q}$ is written in reduced terms if $p$
and $q$ are chosen so that the greatest common divisor of $p$ and $q$ is one.
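
A fraction can be put in reduced terms mechanically by dividing out the greatest common divisor. The sketch below assumes Python's standard `math.gcd` and `fractions.Fraction`; the helper name `reduced_terms` is hypothetical.

```python
from fractions import Fraction
from math import gcd


def reduced_terms(m: int, n: int) -> tuple[int, int]:
    """Return (p, q) with m/n == p/q, q a natural number and gcd(p, q) == 1."""
    if n <= 0:
        raise ValueError("n must be a natural number")
    k = gcd(m, n)          # k >= 1 because n >= 1
    return m // k, n // k  # both divisions are exact


p, q = reduced_terms(-84, 36)
print(p, q)
print(Fraction(-84, 36) == Fraction(p, q))   # the value of the fraction is unchanged
```

Here $-84/36$ reduces to $-7/3$, since the greatest common divisor of $84$ and $36$ is $12$.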


Theorem 7.11. The square root of a prime number p is irrational. 

Proof. We know that the square root of every prime number exists by Exercise 4.12. Suppose
$\sqrt{p} = \frac{m}{n}$ for some $m \in \mathbb{Z}$ and $n \in \mathbb{N}$ written in reduced terms. Then $n^2 p = m^2$ which means
that $p$ is a divisor of $m^2$ and is therefore in the prime decomposition of $m^2$, which is only
possible if $p$ is in the prime decomposition of $m$ by the Fundamental Theorem of Arithmetic,
which means that $m^2$ is divisible by $p^2$. But this means $\frac{m^2}{p} = n^2$ is divisible by $p$, so $n$
is divisible by $p$, contradicting the assumption that $\frac{m}{n}$ was written in reduced terms. We
conclude that $\sqrt{p}$ is irrational.


7.3 Decimals


Definition 47 


If the sequence $\{d_0,\ d_0 + \frac{d_1}{10},\ d_0 + \frac{d_1}{10} + \frac{d_2}{100},\ \dots\} \to r$, where $d_0 \in \mathbb{N} \cup \{0\}$ and $0 \le d_i \le 9$
for each $i \in \mathbb{N}$, then we say that $d_0.d_1d_2d_3\dots$ is a decimal expansion for $r$ and that
$r = d_0.d_1d_2d_3\dots$. We will refer to a sequence $\{d_0,\ d_0 + \frac{d_1}{10},\ d_0 + \frac{d_1}{10} + \frac{d_2}{100},\ \dots\}$, where
$d_0 \in \mathbb{N} \cup \{0\}$ and $0 \le d_i \le 9$ for each $i \in \mathbb{N}$ as a positive decimal sequence (even
though it might consist entirely of zero entries and not actually converge to a positive
number). We refer to $d_0.d_1d_2\dots d_n = d_0 + \frac{d_1}{10} + \frac{d_2}{100} + \dots + \frac{d_n}{10^n}$ as a
terminating decimal expansion and note that this is the same as $d_0.d_1d_2\dots d_n000\dots$
(since this is a constant sequence after the $n$th term and thus converges). We also
refer to $\{-d_0,\ -d_0 - \frac{d_1}{10},\ -d_0 - \frac{d_1}{10} - \frac{d_2}{100},\ \dots\}$ as a negative decimal sequence. If
$\{-d_0,\ -d_0 - \frac{d_1}{10},\ -d_0 - \frac{d_1}{10} - \frac{d_2}{100},\ \dots\} \to r$ then we say that $r = -d_0.d_1d_2d_3\dots$, and
$-d_0.d_1d_2d_3\dots$ is a decimal expansion for $r$.


We will refer to the primary decimal representation of a non-negative real number $r$
as $d_0.d_1d_2d_3\dots$ if $d_0$ is the greatest non-negative integer which is less than or equal
to $r$, $d_1$ is the largest non-negative integer so that $\frac{d_1}{10}$ is less than or equal to $r - d_0$
(so $0 \le d_1 \le 9$ since $r - d_0 < 1$), and $d_2$ is the largest non-negative integer so that
$\frac{d_2}{100}$ is less than or equal to $r - d_0 - \frac{d_1}{10}$ (so $0 \le d_2 \le 9$), and so on, in which case
$\{d_0,\ d_0.d_1,\ d_0.d_1d_2,\ \dots\}$ is the primary decimal sequence for $r$.
If $r$ is a negative real number and the primary decimal representation for $-r$ is
$d_0.d_1d_2d_3\dots$, then we write the primary decimal representation of $r$ as $-d_0.d_1d_2d_3\dots$,
and the negative decimal sequence $\{-d_0,\ -d_0 - \frac{d_1}{10},\ -d_0 - \frac{d_1}{10} - \frac{d_2}{100},\ \dots\}$ is the primary
decimal sequence for $r$.
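
The digit-by-digit choice in the definition of the primary decimal representation can be sketched in Python for rational inputs. This is only an illustration: it assumes exact rational arithmetic via the standard `fractions.Fraction`, and the name `primary_digits` is hypothetical.

```python
from fractions import Fraction
from math import floor


def primary_digits(r: Fraction, n: int) -> list[int]:
    """Return [d_0, d_1, ..., d_n] for the primary decimal representation of r >= 0."""
    if r < 0:
        raise ValueError("r must be non-negative")
    d0 = floor(r)                 # greatest non-negative integer <= r
    digits = [d0]
    remainder = r - d0            # 0 <= remainder < 1
    for k in range(1, n + 1):
        d = floor(remainder * 10**k)      # largest d with d / 10**k <= remainder
        digits.append(d)
        remainder -= Fraction(d, 10**k)   # now 0 <= remainder < 1 / 10**k
    return digits


print(primary_digits(Fraction(1, 8), 5))    # [0, 1, 2, 5, 0, 0]
print(primary_digits(Fraction(22, 7), 6))   # [3, 1, 4, 2, 8, 5, 7]
```

Because each digit is the largest one that does not overshoot, this procedure never produces a tail of nines: for the input $1$ it returns $1.000\dots$ rather than $0.999\dots$.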


Theorem 7.12. If $d_0.d_1d_2d_3\dots$ is the primary decimal representation for a non-negative
real number $r$ then $r = d_0.d_1d_2d_3\dots$. Likewise, if $-d_0.d_1d_2d_3\dots$ is the primary decimal
representation for a negative number $-r$ then $-r = -d_0.d_1d_2d_3\dots$.


Proof. We first observe inductively that, for all positive integers $n$, it is true that
$r - d_0.d_1d_2\dots d_n < \frac{1}{10^n}$. This is true for $n = 1$ by definition. Assume this is true for $n = k$, so
$r - d_0.d_1d_2\dots d_k < \frac{1}{10^k}$. Recall that $d_{k+1}$ is chosen to be the largest non-negative integer so
that $r - d_0.d_1d_2\dots d_k - \frac{d_{k+1}}{10^{k+1}} \ge 0$. It must follow that $r - d_0.d_1d_2\dots d_k - \frac{d_{k+1}+1}{10^{k+1}} < 0$, which
means that $r - d_0.d_1d_2\dots d_k d_{k+1} < \frac{1}{10^{k+1}}$. We conclude that $r - d_0.d_1d_2\dots d_n < \frac{1}{10^n}$
for each $n \in \mathbb{N}$. By Exercise 3.13, we know that $\frac{1}{10^n} \to 0$, so since
$r - \frac{1}{10^n} < d_0.d_1d_2\dots d_n < r + \frac{1}{10^n}$, by the Squeeze Theorem we see that the primary decimal sequence
$\{d_0,\ d_0 + \frac{d_1}{10},\ d_0 + \frac{d_1}{10} + \frac{d_2}{100},\ \dots\} \to r$. Similarly, since
$\{d_0,\ d_0 + \frac{d_1}{10},\ d_0 + \frac{d_1}{10} + \frac{d_2}{100},\ \dots\} \to r$, it follows that
$\{-d_0,\ -d_0 - \frac{d_1}{10},\ -d_0 - \frac{d_1}{10} - \frac{d_2}{100},\ \dots\} \to -r$.


It is not always true that a decimal expansion for a number is unique. For instance,
$0.99999\dots = 1.00000\dots$. To verify this, just note that $0.9999\dots9$ with $n$ entries of 9 is the same
as $1 - \frac{1}{10^n} \to 1$ since we know $\{\frac{1}{10^n}\} \to 0$ by Exercise 3.13. Likewise,
$\{\frac{1}{10^k}(1 - \frac{1}{10^n})\} \to \frac{1}{10^k}$, which means that $0.000\dots0999\dots = 0.000\dots1$, where the first $k$
entries on the left hand side are zero after the decimal point and the $k$th entry after the
decimal point on the right is 1 (and the other entries are zero).
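
The identity behind this observation, that $0.99\dots9$ with $n$ nines equals $1 - \frac{1}{10^n}$, can be checked exactly for small $n$. The sketch below assumes the standard `fractions.Fraction`; the helper name `nines` is hypothetical.

```python
from fractions import Fraction


def nines(n: int) -> Fraction:
    """Return the terminating decimal 0.99...9 with n nines as an exact fraction."""
    return sum((Fraction(9, 10**i) for i in range(1, n + 1)), Fraction(0))


for n in (1, 2, 5, 10):
    gap = 1 - nines(n)
    print(n, gap)   # the gap is exactly 1 / 10**n
```

The gap shrinks by a factor of ten at each step, which is the numerical face of the limit statement; the equality $0.999\dots = 1$ itself is about the limit and is not something a finite computation verifies.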


Note that all the theorems proven about N in Chapter 2 follow without the completeness
axiom, apart from the theorem that N is not bounded above (which need not be true if
the least upper bound property is false). This is important because some decimal properties
are established using induction, and hold whether or not the completeness axiom is assumed.


Theorem 7.13. Let $r = d_0.d_1d_2d_3\dots$ be the limit of a positive decimal sequence
$\{d_0,\ d_0 + \frac{d_1}{10},\ d_0 + \frac{d_1}{10} + \frac{d_2}{100},\ \dots\}$ in an ordered field $S$. Then
$d_0.d_1d_2d_3\dots d_n \le r \le d_0.d_1d_2\dots d_n + \frac{1}{10^n}$
for each natural number $n$. Furthermore, if $-r = -d_0.d_1d_2d_3\dots$ is the limit of a negative
decimal sequence $\{-d_0,\ -d_0 - \frac{d_1}{10},\ -d_0 - \frac{d_1}{10} - \frac{d_2}{100},\ \dots\}$ then
$-d_0.d_1d_2d_3\dots d_n - \frac{1}{10^n} \le -r \le -d_0.d_1d_2d_3\dots d_n$.


Proof. By the Comparison Theorem (which did not require the completeness axiom to
prove), we know that since all terms of the positive decimal sequence
$\{d_0,\ d_0 + \frac{d_1}{10},\ d_0 + \frac{d_1}{10} + \frac{d_2}{100},\ \dots\}$ after the $n$th term are at least as large as $d_0.d_1d_2d_3\dots d_n$, it follows that
$r \ge d_0.d_1d_2d_3\dots d_n$. Similarly, for any natural number $k$, we notice that
$d_0.d_1d_2d_3\dots d_n d_{n+1}\dots d_{n+k} = d_0.d_1d_2d_3\dots d_n + \frac{1}{10^n}(0.d_{n+1}d_{n+2}\dots d_{n+k}) \le d_0.d_1d_2d_3\dots d_n + \frac{1}{10^n}(0.999\dots) = d_0.d_1d_2d_3\dots d_n + \frac{1}{10^n}$.
Thus, by the Comparison Theorem again, we get that
$d_0.d_1d_2d_3\dots d_n \le r \le d_0.d_1d_2\dots d_n + \frac{1}{10^n}$.
From this it also follows that $-d_0.d_1d_2d_3\dots d_n - \frac{1}{10^n} \le -r \le -d_0.d_1d_2d_3\dots d_n$.


The Monotone Convergence Theorem is used in part of the proof below. Recall that the 
completeness axiom was necessary to prove that theorem. 


Theorem 7.14. Let $F$ be an ordered field. Then every non-empty set in $F$ which is bounded
above has a least upper bound if and only if the set $\mathbb{N}$ of natural numbers is not bounded
above and every decimal sequence converges.


Proof. First, assume that $\mathbb{N}$ is not bounded above and that all decimal sequences converge.
Let $S$ be a non-empty subset of $F$ which is bounded above. For now, assume that
$S \cap [0,\infty) \ne \emptyset$. Since $\mathbb{N}$ is not bounded
above, there is a first natural number $d_0 + 1$ which exceeds all elements of $S$ since $\mathbb{N}$ is well
ordered, which means that $d_0$ is the largest integer which does not exceed all elements of $S$.
Since $d_0 + 1$ does exceed all elements of $S$, there is a largest integer $d_1$ with $0 \le d_1 \le 9$ so
that $d_0 + \frac{d_1}{10}$ does not exceed all elements of $S$. If we have chosen $d_i$ for all $i \le k \in \mathbb{N}$ so that
$d_0.d_1d_2\dots d_k$ does not exceed all elements of $S$ and for each $i$ it is true that $d_0.d_1d_2\dots d_i + \frac{1}{10^i}$
does exceed all elements of $S$, then we know $d_0.d_1d_2\dots d_k + \frac{1}{10^k}$ does exceed all elements of
$S$, so we choose $0 \le d_{k+1} \le 9$ to be the largest integer so that $d_0.d_1d_2\dots d_k d_{k+1}$ does not
exceed all elements of $S$. This lets us inductively choose $d_n$ for all $n \in \mathbb{N}$. We claim that
$r = d_0.d_1d_2\dots$ is the least upper bound for $S$.

To see that $r$ is an upper bound for $S$, suppose there is some $s \in S$ so that $s > r$. Then
$s - r > \frac{1}{10^k}$ for some positive integer $k$ by Exercise 3.13. But we know that $d_0.d_1d_2\dots d_k + \frac{1}{10^k}$
exceeds all elements of $S$ so $d_0.d_1d_2\dots d_k \le r < s < d_0.d_1d_2\dots d_k + \frac{1}{10^k}$, so this is impossible.
To see that $r$ is the least upper bound for $S$, let $u < r$. Then $r - u > 0$ and we can, again,
choose $k$ so that $\frac{1}{10^k} < r - u$. We know that $d_0.d_1d_2\dots d_k$ does not exceed all elements of $S$
and $r - d_0.d_1d_2\dots d_k \le \frac{1}{10^k}$ by Theorem 7.13, so there is some $t \in S$ so that $t \ge d_0.d_1d_2\dots d_k$,
and $r - t \le \frac{1}{10^k} < r - u$, so $t > u$, which means that $u$ is not an upper bound for $S$, and
therefore $r$ is the least upper bound of $S$.

If $S$ contains only negative numbers, then $S$ contains a negative number $k$ and so
$(S - k) \cap [0,\infty) \ne \emptyset$ and thus $S - k$ has a least upper bound $r$. For any $x \in S$, it follows
that $x - k \le r$, so $x \le r + k$, so $r + k$ is an upper bound for $S$. If $u < r + k$ then $u - k < r$,
so $u - k$ is not an upper bound for $S - k$, and there is some $x - k > u - k$ for some $x \in S$.
This means that $x > u$, so $u$ is not an upper bound for $S$. Hence, $\sup(S) = r + k$.

Next, assume the completeness axiom. Then let $\{d_0,\ d_0.d_1,\ d_0.d_1d_2,\ \dots\}$ be a positive
decimal sequence. The sequence is non-decreasing since each term is the preceding term
plus a non-negative number. By Theorem 7.13, we know that each $d_0.d_1d_2\dots d_n \le d_0 + 1$,
which means that this decimal sequence is a non-decreasing sequence which is bounded
above and therefore converges to a real number $r$ by the Monotone Convergence Theorem.
Similarly, for any negative decimal sequence $\{-d_0,\ -d_0 - \frac{d_1}{10},\ \dots\}$ we know the corresponding
positive decimal sequence $\{d_0,\ d_0.d_1,\ d_0.d_1d_2,\ \dots\}$ converges to some number $r$, so the negative
decimal sequence converges to $-r$. Hence, assuming the completeness axiom we know that
all decimal sequences converge to real numbers.

The fact that the set of natural numbers is not bounded above also follows from the completeness
axiom, and was proven in Theorem 2.6.
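
The digit-by-digit construction of the least upper bound in the first half of this proof can be imitated in Python. This is a sketch under stated assumptions: the set $S$ is represented only through a hypothetical predicate `exceeds_all(x)` reporting whether $x$ exceeds every element of $S$, arithmetic is exact via `fractions.Fraction`, and the example set $\{x \in \mathbb{Q} \mid x^2 < 2\}$ is chosen purely for illustration.

```python
from collections.abc import Callable
from fractions import Fraction


def sup_digits(exceeds_all: Callable[[Fraction], bool], n: int) -> list[int]:
    """Return [d_0, ..., d_n] chosen as in the proof.

    exceeds_all(x) must report whether x exceeds every element of S, where S
    is bounded above and contains a non-negative number.
    """
    d0 = 0
    while not exceeds_all(Fraction(d0 + 1)):   # stop at the first natural number exceeding S
        d0 += 1
    digits = [d0]
    value = Fraction(d0)
    for k in range(1, n + 1):
        step = Fraction(1, 10**k)
        d = 0
        while d < 9 and not exceeds_all(value + (d + 1) * step):
            d += 1                              # largest digit whose value does not exceed all of S
        digits.append(d)
        value += d * step
    return digits


# Hypothetical example: S = {x rational : x*x < 2}; x exceeds all of S iff x > 0 and x*x >= 2.
print(sup_digits(lambda x: x > 0 and x * x >= 2, 6))   # [1, 4, 1, 4, 2, 1, 3]
```

The digits produced are the start of the decimal expansion of $\sqrt{2}$, the least upper bound of the example set in $\mathbb{R}$.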


We conclude with a theorem that will make the decimal-based argument that the real 
numbers are uncountable that we will discuss in the cardinality portion of the chapter 
rigorous. 


Theorem 7.15. Let $r = r_0.r_1r_2r_3\dots$, $s = s_0.s_1s_2s_3\dots$ be any non-negative real numbers so
that for some non-negative integer $i$ it is the case that $r_i \ne s_i$, and $m$ is the first non-
negative integer so that $r_m \ne s_m$. If $r_m < s_m$ then $r < s$ unless $s_m = r_m + 1$, in which
case $r = s$ if and only if $r_i = 9$ for all $i > m$ and $s_i = 0$ for all $i > m$. In particular, if
$1 < |r_i - s_i| < 9$ for some positive integer $i$ then $r \ne s$.


Proof. Without loss of generality we may assume that $r_m < s_m \le 9$ as described above.
We know that $r \le r_0.r_1r_2\dots r_m + \frac{1}{10^m} \le s_0.s_1s_2s_3\dots s_m \le s$ by Theorem 7.13.
Additionally, by the Comparison Theorem, we note that if any $s_i > 0$ for $i > m$ then
$s_0.s_1s_2s_3\dots s_m < s_0.s_1s_2s_3\dots s_i \le s$, so $r < s$. Likewise, if $r_j < 9$ for any $j > m$ then
$r \le r_0.r_1r_2\dots r_j + \frac{1}{10^j} = r_0.r_1r_2\dots r_{j-1}(r_j + 1) \le r_0.r_1r_2\dots r_m + \frac{1}{10^m} - \frac{1}{10^j}
< s_0.s_1s_2s_3\dots s_m \le s$. Hence, it is only possible for $r$ and $s$ to be equal if
$s_i = 0$ for $i > m$ and $r_j = 9$ for all $j > m$, in which case $r = r_0.r_1r_2\dots r_m + \frac{1}{10^m}(0.999\dots) =
r_0.r_1r_2\dots r_m + \frac{1}{10^m} = r_0.r_1r_2\dots(r_m + 1)$ (since we know $r_m \le 8$).
Thus, if it is true that $1 < |r_i - s_i| < 9$ for some positive integer $i$, then if $i = m$ then
$r \ne s$ since $s_m \ne r_m + 1$, and if $i > m$ then $r \ne s$ since $|r_i - s_i| \ne 9 - 0 = 9$. Hence, $r \ne s$.


7.4 Cardinality 


Since we have access to decimals now, we can use an argument that is perhaps easier to see 
for the uncountability of the real numbers. 


Theorem 7.16. The open interval (0,1) is uncountable, as is R. 


Proof. Suppose $(0,1)$ is countable. Then there is a one to one and onto mapping
$f : \mathbb{N} \to (0,1)$, whose values we can write as decimal expansions as follows:


$f(1) = 0.a_{11}a_{12}a_{13}\dots$

$f(2) = 0.a_{21}a_{22}a_{23}\dots$

$f(3) = 0.a_{31}a_{32}a_{33}\dots$

and so forth. For each $n \in \mathbb{N}$ we choose $z_n = 7$ if $a_{nn} < 5$ and $z_n = 3$ if $a_{nn} \ge 5$. Since
the $n$th digit of the number $z = 0.z_1z_2z_3\dots$ differs from the $n$th digit of $f(n)$ by a number
which is at least two and less than nine, it follows that $z \ne f(n)$ for any $n \in \mathbb{N}$ by Theorem 7.15,
a contradiction to $f$ being onto.

The real numbers contain (0,1) (and a similar argument could be performed starting 
with the real numbers themselves) so R is also uncountable. 
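
The diagonal digit rule used in this proof can be tried on a finite table of digits. The sketch below is only an illustration of the rule on finitely many rows: the function name `diagonal_digits` and the sample rows are hypothetical.

```python
def diagonal_digits(rows: list[list[int]]) -> list[int]:
    """Given rows[n][k] = digit a_{n+1,k+1}, return z_1, ..., z_N with z_n = 7 if a_nn < 5 else 3."""
    return [7 if rows[n][n] < 5 else 3 for n in range(len(rows))]


# Hypothetical first four entries of a claimed enumeration of (0, 1), four digits each.
rows = [
    [1, 4, 1, 5],
    [9, 9, 9, 9],
    [5, 0, 0, 0],
    [2, 7, 1, 8],
]
z = diagonal_digits(rows)
print(z)   # [7, 3, 7, 3]
```

Each chosen digit differs from the corresponding diagonal digit by at least two and by less than nine, which is the condition needed to apply Theorem 7.15.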


Theorem 7.17. Let $S \subseteq T$, where $T$ is a finite set. Then $S$ is finite and $|S| \le |T|$.


Proof. Since $T$ is finite we can find a natural number $n$ and a one to one and onto function
$f : \{1,2,3,\dots,n\} \to T$. If $S$ is empty then $S$ is finite. Otherwise, let $s \in S$. If we define
$F(i) = f(i)$ if $f(i) \in S$ and define $F(i) = s$ otherwise then $F : \{1,2,3,\dots,n\} \to S$ is onto.
By the well-ordering of the natural numbers there is a least natural number $m \le n$ so that
there is an onto mapping $g : \{1,2,3,\dots,m\} \to S$. To see that $g$ is one to one, suppose
that $g(u) = g(t)$ for some $u < t$. Then we can define a function $h : \{1,2,3,\dots,m-1\} \to S$ by
setting $h(i) = g(i)$ if $i < t$ and $h(i) = g(i+1)$ if $t \le i \le m-1$. For each $x \in S$ there is some
$1 \le w \le m$ so that $g(w) = x$, so if $w < t$ then $h(w) = x$ and if $w > t$ then $h(w-1) = x$,
and if $w = t$ then $h(u) = x$. Hence, $h$ is onto, contradicting the choice of $m$. It follows that
$g$ is one to one and onto, so $|S| = m \le n = |T|$.


Theorem 7.18. Let $S$ be a nonempty set. Then $S$ is finite if and only if there is a positive
integer $n$ and a function $f : \{1,2,3,\dots,n\} \to S$ which is onto.


Proof. If $S$ is finite then such a function (which is both one to one and onto) exists by
definition. Assume there is a function $f : \{1,2,3,\dots,n\} \to S$ which is onto. Then there is a
first integer $m$ so that there is an onto function $g : \{1,2,\dots,m\} \to S$. Suppose $g$ is not one
to one. Then there are integers $i < j$ so that $g(i) = g(j)$, in which case we can define a
function $G : \{1,2,\dots,m-1\} \to S$ which is onto by setting $G(k) = g(k)$ if $k < j$ and setting
$G(k) = g(k+1)$ if $k \ge j$. This contradicts our choice of $m$, so it follows that $g$ is one to one
and onto, and therefore $|S| = m$, which means that $S$ is finite.


Theorem 7.19. A non-empty set $S$ is countable if and only if there is an onto function
$f : \mathbb{N} \to S$.


Proof. We know that if $S$ is countable then either $S$ is countably infinite (in which case
there is a one to one and onto function $f : \mathbb{N} \to S$ by definition), or $S$ is finite, in which
case there is a function $g : \{1,2,3,\dots,n\} \to S$ which is onto (for some natural number $n$),
and so if we then define $f(i) = g(1)$ for $i > n$ and $f(i) = g(i)$ for natural numbers $i \le n$
then we have a function $f : \mathbb{N} \to S$ which is onto.

Next, assume there is a function $f : \mathbb{N} \to S$ which is onto. First, assume there is an integer
$n$ so that $f(\{1,2,\dots,n\}) = S$. In this case, we will show that $S$ is finite. By well ordering,
there is a first integer $m$ so that there is an onto function $g : \{1,2,\dots,m\} \to S$. Suppose $g$
is not one to one. Then there are integers $i < j$ so that $g(i) = g(j)$, in which case we can
define a function $G : \{1,2,\dots,m-1\} \to S$ which is onto by setting $G(k) = g(k)$ if $k < j$
and setting $G(k) = g(k+1)$ if $k \ge j$. This contradicts our choice of $m$, so it follows that $g$
is one to one and onto, and therefore $|S| = m$.

Next, assume that there is no integer $n$ so that $f(\{1,2,\dots,n\}) = S$. In this case, we will
show that $S$ is countably infinite. Let $h(1) = f(1)$. If $h(i)$ has been chosen for all $i \le k$ then
pick $h(k+1) = f(z)$, where $z$ is the first integer such that $f(z) \notin \{h(1),h(2),h(3),\dots,h(k)\}$.
Note that such a $z$ will always exist since otherwise every value of $f$ would lie in
$\{h(1),\dots,h(k)\}$, and setting $t$ to be the largest of the finitely many integers used to define
$h(1),\dots,h(k)$, it would follow that $f(\{1,2,\dots,t\}) = S$. Thus, $h : \mathbb{N} \to S$. We know that $h$ is
one to one because each choice of $h(j)$ was chosen to be different from $h(i)$ for $i < j$. We
know that $h$ is onto because given any $x \in S$ there is an $s$ so that $f(s) = x$ and therefore
either $h(s) = x$ or $x \in \{h(1),h(2),\dots,h(s-1)\}$ by construction. Hence, $S$ is countably
infinite.
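
The construction of the one to one function $h$ from the onto function $f$ amounts to listing the values of $f$ in order and skipping repeats. The following Python sketch illustrates this; the names `one_to_one_enumeration` and the sample map `f` are hypothetical, and only finitely many terms are ever inspected.

```python
from collections.abc import Callable, Hashable, Iterator
from itertools import count, islice


def one_to_one_enumeration(f: Callable[[int], Hashable]) -> Iterator[Hashable]:
    """Yield h(1), h(2), ... where h(k+1) = f(z) for the first z with f(z) not already listed."""
    seen = set()
    for z in count(1):
        value = f(z)
        if value not in seen:      # first index whose image is new
            seen.add(value)
            yield value


# Hypothetical onto map from N to the integers with every value repeated:
# f(1), f(2), ... = 0, 0, 1, 1, -1, -1, 2, 2, ...
def f(n: int) -> int:
    k = (n - 1) // 2               # drop the repetition
    return (k + 1) // 2 if k % 2 else -(k // 2)


print(list(islice(one_to_one_enumeration(f), 7)))   # [0, 1, -1, 2, -2, 3, -3]
```

If the image of `f` were finite, the generator would keep searching forever after the last new value, which mirrors the case split in the proof; `islice` is used here so that only a finite prefix is requested.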


Theorem 7.20. If $|A| = n$ and $|A| = m$ then $n = m$.


Proof. We proceed by induction on $n$. If $n = 1$ then there is a one to one and onto function
$h : \{1\} \to A$. Thus, the definition of a function implies $A = \{h(1)\}$. Let $m > 1$ and let
$r : \{1,2,\dots,m\} \to A$. Since each $r(i) = h(1)$ (the only element of $A$) it must follow that $r$
is not one to one. Hence, it is false that $|A| = m$.

Assume the statement of the theorem is true whenever $n \le k \in \mathbb{N}$. Let $f : \{1,2,\dots,k+1\} \to A$
be a one to one and onto function, and let $g : \{1,2,\dots,m\} \to A$ be a one to one
and onto function. Then defining $F(i) = f(i)$ for all $i \le k$ defines a one to one and onto
function $F : \{1,2,\dots,k\} \to A \setminus \{f(k+1)\}$. For some $j$ we know that $g(j) = f(k+1)$, so we
define a function $G : \{1,2,\dots,m-1\} \to A \setminus \{f(k+1)\}$ by setting $G(i) = g(i)$ if $i \ne j$ and
setting $G(j) = g(m)$ if $j \ne m$. If $j = m$ then we just set $G(i) = g(i)$ for all $1 \le i \le m-1$.
Then $F$ and $G$ are one to one and onto functions and by the induction hypothesis, it follows
that $m - 1 = k$, and therefore $m = k + 1$. By induction, the result follows.


Theorem 7.21. Let $A$ and $B$ be sets. Then there is an onto function $f : A \to B$ if and
only if there is a one to one function $g : B \to A$.


Proof. Assume there is an onto function $f : A \to B$. For each $b \in B$ we can choose a point
$a_b \in f^{-1}(b)$, and set $g(b) = a_b$. Then if $a_b = a_c$ it must follow that $b = c$ because $f$ is a
function. Hence, $g$ is one to one.

Next, assume there is a one to one function $g : B \to A$. For each $a \in \mathrm{ran}(g)$, define
$f(a)$ to be the point $b$ such that $g(b) = a$ (since $g$ is one to one there is only one such choice
of $b$). Choose an element $w \in B$. If $a \in A \setminus \mathrm{ran}(g)$ then define $f(a) = w$. Then $f$ is onto
since $f(\mathrm{ran}(g)) = B$.


Theorem 7.22. N x N is countably infinite. 


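Proof. Define $f : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ by setting $f(i,j) = 2^i 3^j$ for all $i,j \in \mathbb{N}$. By the Fundamental
Theorem of Arithmetic, if $i \ne s$ or $j \ne t$ for $s,t \in \mathbb{N}$ then $2^i 3^j \ne 2^s 3^t$, which means that
$f$ is one to one, and therefore by Theorem 7.21 and Theorem 7.19 it follows that $\mathbb{N} \times \mathbb{N}$ is
countable. Since $\mathbb{N} \times \mathbb{N}$ contains the distinct points $(n,1)$ for every $n \in \mathbb{N}$ it is not finite, so
it is countably infinite.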
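
The injectivity of $(i,j) \mapsto 2^i 3^j$ can be spot-checked on a finite grid, and the pair can be recovered from its code by counting factors of 2 and 3. The sketch below is illustrative only; `encode` and `decode` are hypothetical helper names.

```python
def encode(i: int, j: int) -> int:
    """The map f(i, j) = 2**i * 3**j from N x N into N."""
    return 2**i * 3**j


def decode(n: int) -> tuple[int, int]:
    """Recover (i, j) from n = 2**i * 3**j by counting factors of 2 and 3."""
    if n <= 0:
        raise ValueError("n must be a natural number")
    i = j = 0
    while n % 2 == 0:
        n //= 2
        i += 1
    while n % 3 == 0:
        n //= 3
        j += 1
    if n != 1:
        raise ValueError("not of the form 2**i * 3**j")
    return i, j


pairs = [(i, j) for i in range(1, 30) for j in range(1, 30)]
codes = {encode(i, j) for i, j in pairs}
print(len(codes) == len(pairs))   # distinct pairs receive distinct codes
print(decode(encode(5, 7)))       # (5, 7)
```

That `decode` always succeeds on the output of `encode` is exactly the uniqueness of prime factorization applied to the primes 2 and 3.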


Theorem 7.23. For each $n \in \mathbb{N}$, let $A_n$ be a countable set. Then $S = \bigcup_{n=1}^{\infty} A_n$ is countable.
In other words, the union of a countable collection of countable sets is countable.


Proof. If each $A_n$ is empty then the result follows. Otherwise at least one $A_n$ is non-empty,
and by relabeling we may assume that $A_1$ is non-empty.
For each natural number $i$, if $A_i \ne \emptyset$ then by Theorem 7.19, we can define an onto function
$g_i : \mathbb{N} \to A_i$. By Theorem 7.22, we can find a function $h : \mathbb{N} \to \mathbb{N} \times \mathbb{N}$ which is one to one
and onto. We then define $f : \mathbb{N} \times \mathbb{N} \to S$ by setting $f(i,j) = g_i(j)$ if $A_i \ne \emptyset$ and define
$f(i,j) = g_1(1)$ otherwise. Then $f$ is onto since for each $x \in S$ we know that $x \in A_i$ for some
$i \in \mathbb{N}$, so $g_i(j) = x$ for some $j \in \mathbb{N}$, which means that $f(i,j) = x$. Hence, $f \circ h : \mathbb{N} \to S$ is a
composition of onto functions which is onto, so $S$ is countable by Theorem 7.19.


Notice that there is no requirement that the sets $A_i$ be non-empty in the preceding
proof, so in the case where only finitely many of the $A_i$ sets are non-empty the union is
just the union of the finitely many sets $A_i$ which are non-empty. Hence, the finite union of
countable sets is also countable.


We have already proven the following theorem, but the proof involved a process of 
choosing that might not be comfortable for some readers. Here is another proof that the 
rational numbers are countable using the theorems in this section. 


Theorem 7.24. The set Q of rational numbers is countable. 


Proof. For each $n \in \mathbb{N}$, we let $S_n = \{\frac{m}{n} \in \mathbb{Q} \mid m \in \mathbb{Z}\}$. Each $S_n$ is countable. One way to
observe this is to note that since $\mathbb{Z}$ is countable there is a function $g : \mathbb{N} \to \mathbb{Z}$ which is one to
one and onto, so defining $G_n(i) = \frac{g(i)}{n}$ we have $G_n : \mathbb{N} \to S_n$ which is one to one and onto,
so $S_n$ is countable. Since $\mathbb{Q} = \bigcup_{n=1}^{\infty} S_n$, it follows from Theorem 7.23 that $\mathbb{Q}$ is countable.


Theorem 7.25. Let $S,T$ be disjoint finite sets. Then $|S \cup T| = |S| + |T|$. Furthermore, if
$\{S_1,S_2,\dots,S_m\}$ is a pairwise disjoint set of finite sets with $|S_i| = n_i$ for all $1 \le i \le m$ then
$|\bigcup_{i=1}^{m} S_i| = \sum_{i=1}^{m} n_i$.


Proof. Let $|S| = n$ and $|T| = m$. Define $h : \{1,2,3,\dots,n\} \to \{m+1,m+2,\dots,m+n\} =
[m+1,m+n] \cap \mathbb{N}$ by $h(i) = i + m$. Then $h$ is one to one since if $i < j \le n$ then
$h(i) = i + m < j + m = h(j)$, and $h$ is onto since if $j$ is a positive integer in $[m+1,m+n]$
then $1 \le j - m \le n$, which means $j = (j - m) + m = h(j - m)$ is in the range of $h$.

Since $|S| = n$ we can find $g_1 : S \to \{1,2,3,\dots,n\}$ so that $g_1$ is one to one and onto. Since
$|T| = m$ we can also find $g_2 : T \to \{1,2,3,\dots,m\}$ so that $g_2$ is one to one and onto (these are
the inverses of the one to one and onto functions given by the definition of cardinality). We then
define $f : S \cup T \to \{1,2,3,\dots,m+n\}$ by setting $f(x) = g_2(x)$ if $x \in T$, and $f(x) = h(g_1(x))$
if $x \in S$. Then $f$ is a well defined function since $S \cap T = \emptyset$. We know that $f$ is one to one
since if $x \ne y$ in $S \cup T$ then if $x \in S$ and $y \in T$ then $f(x) > m$ and $f(y) \le m$. If $x,y \in S$ then
$g_1(x) \ne g_1(y)$ since $g_1$ is one to one, which means that $f(x) = h(g_1(x)) \ne h(g_1(y)) = f(y)$
since $h$ is one to one. If $x,y \in T$ then $f(x) = g_2(x) \ne g_2(y) = f(y)$ since $g_2$ is one to one.

Furthermore, $f$ is onto since for every positive integer $i \le m$ there is some $x \in T$ so that
$g_2(x) = i = f(x)$ since $g_2$ is onto, and for every positive integer $m+1 \le i \le m+n$ there is
some $x \in S$ so that $g_1(x) = i - m$ and hence $f(x) = i - m + m = i$ since $g_1$ is onto.

Since $f$ is one to one and onto, it follows that $|S \cup T| = m + n = |S| + |T|$.

We extend this to a finite pairwise disjoint collection $\{S_1,S_2,\dots,S_m\}$ of finite sets with
$|S_i| = n_i$ for all $1 \le i \le m$ inductively. It is certainly true that if $m = 1$ then $|S_1| = n_1$.
We assume that whenever there are $k$ many pairwise disjoint finite sets $S_1,S_2,\dots,S_k$ with
$|S_i| = n_i$ for all $1 \le i \le k$ then $|\bigcup_{i=1}^{k} S_i| = \sum_{i=1}^{k} n_i$. We take a collection $\{S_1,S_2,\dots,S_k,S_{k+1}\}$
of pairwise disjoint finite sets with $|S_i| = n_i$ for all $1 \le i \le k+1$ and note by the induction
hypothesis that $|\bigcup_{i=1}^{k} S_i| = \sum_{i=1}^{k} n_i$. Hence, since $(\bigcup_{i=1}^{k} S_i) \cap S_{k+1} = \emptyset$, by the earlier part of
this theorem we know that $|\bigcup_{i=1}^{k+1} S_i| = |\bigcup_{i=1}^{k} S_i| + |S_{k+1}| = \sum_{i=1}^{k+1} n_i$. The result then follows by
induction.


The following is used often in combinatorics. 


Theorem 7.26. Pigeonhole Principle. Let $\{S_1,S_2,\dots,S_m\}$ be a set of finite sets with $|S_i| = n_i$
for all $1 \le i \le m$. Then $|\bigcup_{i=1}^{m} S_i| \le \sum_{i=1}^{m} n_i$. In other words, if $|\bigcup_{i=1}^{m} S_i| > \sum_{i=1}^{m} n_i$ then for at
least one $i$ it is true that $|S_i| > n_i$.


Proof. We define $T_1 = S_1$ and inductively define $T_n = S_n \setminus \bigcup_{i=1}^{n-1} S_i$ for $1 < n \le m$. Then the
sets $T_n$ are all pairwise disjoint and since $T_n \subseteq S_n$ we know that $|T_n| \le |S_n| = n_n$ by
Theorem 7.17. Thus, $|\bigcup_{i=1}^{m} S_i| = |\bigcup_{i=1}^{m} T_i| = \sum_{i=1}^{m} |T_i| \le \sum_{i=1}^{m} n_i$ by Theorem 7.25.
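
The disjoint pieces $T_n$ used in this proof are easy to compute for concrete finite sets. The sketch below uses Python's built-in `set` type; the function name `disjointify` and the sample sets are hypothetical.

```python
def disjointify(sets: list[set]) -> list[set]:
    """Return T_1, ..., T_m with T_1 = S_1 and T_n = S_n minus the union of S_1, ..., S_{n-1}."""
    seen: set = set()
    result = []
    for s in sets:
        result.append(s - seen)   # T_n is a subset of S_n, disjoint from every earlier T_i
        seen |= s
    return result


# Hypothetical example sets.
sets = [{1, 2, 3}, {3, 4}, {1, 4, 5, 6}]
parts = disjointify(sets)
union = set().union(*sets)
print(parts)
print(len(union), sum(len(t) for t in parts), sum(len(s) for s in sets))
```

In this example the union has 6 elements, the pieces $T_n$ have sizes 3, 1 and 2, and the original sets have sizes summing to 9, in line with the inequality of the theorem.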


Theorem 7.27. Let $A \subseteq B$. If $A$ is uncountable then $B$ is uncountable.


Proof. Assume $A$ is uncountable and $a \in A$. If $B$ is countable then there is an onto function
$f : \mathbb{N} \to B$ by Theorem 7.19. We can create a map $g : B \to A$ which is onto by setting
$g(x) = x$ if $x \in A$ and $g(x) = a$ otherwise. Then $g \circ f : \mathbb{N} \to A$ is onto, which means that $A$
is countable by Theorem 7.19. This is impossible, so we conclude that $B$ is uncountable.


Theorem 7.28. Let $m,n \in \mathbb{N}$. Then $|\{1,2,3,\dots,n\} \times \{1,2,3,\dots,m\}| = nm$.


Proof. Fix n € N. Then for any natural number i, the function g; : {1,2,3,...,n} x {i} > 
{1,2,3,...,n} defined by g;(k) = (i,k) for each 1 < k < n is one to one and onto. Since 
m 


{1,25 3,.:2;%} © 11,2,3,.4, 0} = U x {1,2,3,...,n} x {7}, we know from Theorem 7.25 that 
i=1 


HL sh) R112 |= Sn am. 
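The decomposition of the product into $m$ disjoint slices of size $n$ can be checked directly for small values. This is a minimal sketch; the names `rectangle` and `column` and the values of `n` and `m` are chosen only for illustration.

```python
from itertools import product

def rectangle(n, m):
    """The set {1,...,n} x {1,...,m} as a set of ordered pairs."""
    return set(product(range(1, n + 1), range(1, m + 1)))

def column(n, i):
    """The slice {1,...,n} x {i} used in the proof."""
    return {(k, i) for k in range(1, n + 1)}

n, m = 4, 3  # hypothetical sample sizes
slices = [column(n, i) for i in range(1, m + 1)]
# The slices together make up the whole product...
assert set().union(*slices) == rectangle(n, m)
# ...and each has n elements, so the total is n * m.
assert sum(len(s) for s in slices) == n * m == len(rectangle(n, m))
```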


Definition 48 


We say $|A| \le |B|$ if and only if there is an onto function from $B$ onto $A$ (or, equivalently, a one to one function from $A$ into $B$). We say $|A| > |B|$ (or $|B| < |A|$) if $|A| \ge |B|$ and it is false that $|B| = |A|$.


The following proof uses the Axiom of Choice, in the equivalent form which states that every set can be ordered in a linear way (satisfying Trichotomy, so that exactly one of $a < b$, $b < a$ or $a = b$ is true for each $a, b$ in the set, and also Transitivity of order, so that if $a < b$ and $b < c$ then $a < c$ for any $a, b, c$ in the set) which is well-ordered (meaning that every non-empty subset of the set under the order given has a least element). The process described for defining the function $g$ is known as transfinite induction, and its validity is a consequence of the well-ordering and set specification.


Theorem 7.29. Let $A, B$ be sets so that it is not true that $|A| \ge |B|$. Then $|A| < |B|$.


Proof. We use the Axiom of Choice to well-order $A$ and $B$, creating a linear ordering of both sets in which every non-empty subset has a least element. We generate a function $g : A \to B$ by assigning $g(a_0) = b_0$ where $a_0$ is the first element of $A$ and $b_0$ is the first element of $B$. If we have defined $g(x)$ for all $x < y$ in $A$ then we define $g(y)$ to be the first element of $B$ which is not equal to $g(x)$ for any $x < y$. This choice is always possible since it is false that $|A| \ge |B|$, so we know that, for any $y \in A$, under the well-ordering assigned to $A$, the function $g$ restricted to the domain $\{x \in A \mid x < y\}$ is not onto (since otherwise we could define $g(x) = z$ for all other points $x \in A$ and any point $z \in B$ and we would have an onto function $g : A \to B$). Thus, for each $y \in A$ there is always some $g(y) \in B$ which is not the image of any $x < y$. Hence, $g : A \to B$ is one to one, so $|B| \ge |A|$. Since it is false that $|A| = |B|$ it follows that $|A| < |B|$.


Theorem 7.30. Cantor's Theorem. Let $S$ be a set and let $P(S)$ be the set of subsets of $S$. Then $|P(S)| > |S|$.


Proof. Suppose there is a function $f : S \to P(S)$ which is onto. Let $A = \{x \in S \mid x \notin f(x)\}$. Since $f$ is onto there must be some $y \in S$ so that $f(y) = A$. But then $y \notin A$, since if $y \in A$ then $y \notin f(y) = A$ by definition of $A$. On the other hand, if $y \notin A$ then $y \notin f(y)$, so $y \in A$. Since one or the other must be true if $f$ is onto, and both lead to a contradiction, we have a contradiction. We conclude that it is false that $f$ is onto, so it is false that $|P(S)| = |S|$. Since the function sending each $x \in S$ to $\{x\}$ is a one to one function from $S$ into $P(S)$, we also know that $|P(S)| \ge |S|$, and therefore $|P(S)| > |S|$.
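For a small finite set the diagonal argument can be verified exhaustively. The following sketch checks every function $f : S \to P(S)$ for a three-element set and confirms that the set $A = \{x \in S \mid x \notin f(x)\}$ is never a value of $f$. The function names are assumptions made for the example, and a finite check is of course only an illustration of the proof, not a substitute for it.

```python
from itertools import combinations, product

def power_set(s):
    """All subsets of the finite set s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def diagonal_set(s, f):
    """The set A = {x in S : x not in f(x)} from the proof."""
    return frozenset(x for x in s if x not in f[x])

def every_map_misses_diagonal(s):
    """Check, for every f : S -> P(S), that A is not a value of f."""
    items = sorted(s)
    subsets = power_set(s)
    for values in product(subsets, repeat=len(items)):
        f = dict(zip(items, values))   # one candidate function f
        if diagonal_set(s, f) in f.values():
            return False
    return True

# With |S| = 3 there are 8**3 = 512 functions to examine.
assert every_map_misses_diagonal({0, 1, 2})
```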


Theorem 7.31. Cantor-Schroeder-Bernstein. Let $A, B$ be sets so that $|A| \ge |B|$ and $|B| \ge |A|$. Then $|A| = |B|$.


Proof. Step 1. Let $D \subseteq A$ and assume there is a one to one function $h : A \to D$. Then $|A| = |D|$.

To see this, let $C_1 = A \setminus D$ and then inductively define $C_{n+1} = h(C_n)$ for each positive integer $n$. We wish to find $\phi : A \to D$ which is one to one and onto.

We define $\phi(x) = h(x)$ if $x \in \bigcup_{i=1}^{\infty} C_i = C$ and define $\phi(x) = x$ otherwise. Notice that the range of $h$ is contained in $D$ and $A \setminus D = C_1 \subseteq C$, which means that the identity map on $A \setminus C$ also has range contained in $D$, so the range of $\phi$ is contained in $D$.

To see that $\phi$ is one to one, let $a, b$ be distinct elements of $A$. If $a, b \in C$ then since $h$ is one to one $\phi(a) = h(a) \ne h(b) = \phi(b)$. If $a, b \in A \setminus C$ then $\phi(a) = a \ne b = \phi(b)$. If $a \in C$ and $b \in A \setminus C$ then $\phi(b) = b$ and for some positive integer $j$ we know that $a \in C_j$. Thus, $\phi(a) = h(a) \in C_{j+1} \subseteq C$, which means that $\phi(a) \ne \phi(b)$. Thus, $\phi$ is one to one.

To see that $\phi$ is onto, let $d \in D$. If $d \notin C$ then $\phi(d) = d$. If $d \in C$ then $d \in C_j$ for some positive integer $j > 1$ (since $d$ cannot be an element of $C_1$), which means that $d = h(s) = \phi(s)$ for some $s \in C_{j-1}$. It follows that $|A| = |D|$.

Step 2: Let $f : A \to B$ and $g : B \to A$ be one to one functions. Then $g \circ f$ is a one to one function mapping $A$ to $g(f(A)) \subseteq g(B)$, which means that $|A| = |g(B)|$ by Step 1. Hence, we can find a one to one and onto mapping $w : A \to g(B)$, so the composition $g^{-1} \circ w : A \to B$ is one to one and onto, and hence $|A| = |B|$.
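The map $\phi$ of Step 1 can be computed explicitly in a simple case. The sketch below assumes the hypothetical choices $A = \{0, 1, 2, \ldots\}$, $D = \{1, 2, 3, \ldots\}$ and $h(n) = n + 2$, none of which come from the text. Here $C_1 = \{0\}$, $C_2 = \{2\}$, $C_3 = \{4\}$ and so on, so $C$ is the set of even numbers.

```python
def h(n):
    """A one to one map from A = {0, 1, 2, ...} into D = {1, 2, 3, ...}."""
    return n + 2

def in_C(x):
    """Decide whether x lies in C = C_1 ∪ C_2 ∪ ..., where C_1 = A minus D = {0}
    and C_{n+1} = h(C_n), by pulling x back through h as far as possible."""
    while True:
        if x == 0:    # x is in C_1 = A minus D
            return True
        if x < 2:     # x is in D but not in the range of h
            return False
        x -= 2        # replace x by its unique preimage under h

def phi(x):
    """The map from A to D built in Step 1: h on C, the identity elsewhere."""
    return h(x) if in_C(x) else x

values = [phi(x) for x in range(100)]
assert len(set(values)) == len(values)      # one to one on this window
assert set(range(1, 100)) <= set(values)    # every d in D below 100 is hit
assert 0 not in values                      # the range stays inside D
```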


7.5 More on Infinite and Sequence Limits 


Many theorems regarding infinite limits have a large number of cases, depending on whether the domain is approaching infinity or a number or negative infinity, or the range is approaching infinity or negative infinity or a number. This can sometimes give rise to nine cases for some theorems, which can be cumbersome. In some theorems these cases can be consolidated by the following idea, which still requires proof in multiple cases, but which means that fewer of the later proofs need to be split into as many cases.


Definition 49 


We say that $x$ is an extended real number if $x \in \mathbb{R}$ or $x = \infty$ or $x = -\infty$. Let $D \subseteq \mathbb{R}$. We will define the extended $i$-neighborhood about an extended real number $x$ with respect to $D$, denoted $N_D(x, i)$, to be $\left(x - \frac{1}{i}, x + \frac{1}{i}\right) \cap D \setminus \{x\}$ if $x \in \mathbb{R}$, and $(i, \infty) \cap D$ if $x = \infty$ and $(-\infty, -i) \cap D$ if $x = -\infty$. If $D = \mathbb{R}$, we just write $N(x, i)$ instead of $N_{\mathbb{R}}(x, i)$. We consider the extended real number $\infty$ to be greater than any real number and $-\infty$ to be less than any real number. We refer to the extended real number $x$ as being an extended limit point of a set $D$ if $x \in \mathbb{R}$ and $x$ is a limit point of $D$, or if $x = \infty$ and $D$ is not bounded above or $x = -\infty$ and $D$ is not bounded below. In other words, $x$ is an extended real limit point of $D$ if $N_D(x, i) \ne \emptyset$ for all $i \in \mathbb{N}$.
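Membership in an extended neighborhood can be expressed as a short test covering all three cases of the definition. This is a sketch under stated assumptions: the function name, the use of `math.inf` to stand for the symbols $\infty$ and $-\infty$, and the representation of $D$ by a membership predicate `in_domain` are all choices made for the example.

```python
import math

def in_extended_neighborhood(y, x, i, in_domain=lambda _: True):
    """Return True if the real number y lies in N_D(x, i).

    x is a real number or +/- math.inf, i is a natural number, and
    in_domain(y) says whether y belongs to D (the default is D = R).
    """
    if not in_domain(y):
        return False
    if x == math.inf:
        return y > i                      # (i, infinity) ∩ D
    if x == -math.inf:
        return y < -i                     # (-infinity, -i) ∩ D
    return y != x and abs(y - x) < 1 / i  # punctured interval about x

assert in_extended_neighborhood(0.1, 0, 5)       # |0.1 - 0| < 1/5
assert not in_extended_neighborhood(0, 0, 5)     # the point x is excluded
assert in_extended_neighborhood(10, math.inf, 5)
assert not in_extended_neighborhood(3, math.inf, 5)
```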


Note that we are not stating that $\infty$ or $-\infty$ exist as part of a field containing the real numbers. We are using the convention that $\infty$ is greater than any real number and $-\infty$ is less than any real number for reference purposes. For instance, if a theorem says $L > 0$ and $L$ is an extended real number then $L$ could be a positive number or the symbol $L$ could represent $\infty$.


Theorem 7.32. Let $f : D \to \mathbb{R}$. Then $\lim_{x \to c} f(x) = L$, where $c, L$ are extended real numbers, if and only if $c$ is an extended real limit point of $D$ and for every $k \in \mathbb{N}$ it is true that there is some $j \in \mathbb{N}$ so that if $x \in N_D(c, j)$ then $f(x) \in N(L, k)$.


Proof. Case where $c \in \mathbb{R}$ and $L \in \mathbb{R}$: In this case, $\lim_{x \to c} f(x) = L$ if and only if for every $\epsilon > 0$ there is a $\delta > 0$ so that if $0 < |x - c| < \delta$ then $|f(x) - L| < \epsilon$. Thus, in particular, for any $k \in \mathbb{N}$ there is a $\delta > 0$ so that if $0 < |x - c| < \delta$ then $|f(x) - L| < \frac{1}{k}$. Choose $j \in \mathbb{N}$ so that $\frac{1}{j} < \delta$. Then if $x \in N_D(c, j)$ it follows that $f(x) \in N(L, k)$.

Conversely, assume that for every $k \in \mathbb{N}$ it is true that there is some $j \in \mathbb{N}$ so that if $x \in N_D(c, j)$ then $f(x) \in N(L, k)$. Let $\epsilon > 0$. Choose $k \in \mathbb{N}$ so that $\frac{1}{k} < \epsilon$. Then choose $j$ so that if $x \in N_D(c, j)$ then $f(x) \in N(L, k)$. This means that if $0 < |x - c| < \frac{1}{j}$ and $x \in D$ then $|f(x) - L| < \frac{1}{k} < \epsilon$, so $\lim_{x \to c} f(x) = L$.
Case where $c \in \mathbb{R}$ and $L = \infty$ (or $-\infty$ respectively): In this case, $\lim_{x \to c} f(x) = L$ if, for every $M$, there is a $\delta > 0$ so that if $0 < |x - c| < \delta$ and $x \in D$ then $f(x) > M$ (respectively $f(x) < M$). In particular, for any $k \in \mathbb{N}$ we can find a $\delta > 0$ so that if $0 < |x - c| < \delta$ and $x \in D$ then $f(x) > k$ (respectively $f(x) < -k$). Choose $j \in \mathbb{N}$ so that $\frac{1}{j} < \delta$. Then if $x \in N_D(c, j)$ it follows that $f(x) \in N(L, k)$.

Conversely, assume that for every $k \in \mathbb{N}$ it is true that there is some $j \in \mathbb{N}$ so that if $x \in N_D(c, j)$ then $f(x) \in N(L, k)$. Let $M \in \mathbb{R}$ and choose $k \in \mathbb{N}$ so $k > M$ (respectively $-k < M$). Choose $j$ so that if $x \in N_D(c, j)$ then $f(x) \in N(L, k)$. If $L = \infty$ this means that if $0 < |x - c| < \frac{1}{j}$ and $x \in D$ then $f(x) > k > M$, so $\lim_{x \to c} f(x) = \infty$. If $L = -\infty$ this means that if $0 < |x - c| < \frac{1}{j}$ and $x \in D$ then $f(x) < -k < M$, so $\lim_{x \to c} f(x) = -\infty$.

Case where $c = \infty$ (or $-\infty$ respectively) and $L \in \mathbb{R}$: In this case, $\lim_{x \to c} f(x) = L$ if and only if for every $\epsilon > 0$ there is an $M \in \mathbb{R}$ so that if $x > M$ (or $x < M$ respectively) then $|f(x) - L| < \epsilon$. Thus, in particular, for any $\frac{1}{k}$ where $k \in \mathbb{N}$, there is an $M$ so that if $x > M$ (or $x < M$ respectively) then $|f(x) - L| < \frac{1}{k}$. Choose $j \in \mathbb{N}$ so that $j > M$ (or $-j < M$ respectively). Then if $x \in N_D(c, j)$ it follows that $f(x) \in N(L, k)$.

Conversely, assume that for every $k \in \mathbb{N}$ it is true that there is some $j \in \mathbb{N}$ so that if $x \in N_D(c, j)$ then $f(x) \in N(L, k)$. Let $\epsilon > 0$. Choose $k \in \mathbb{N}$ so that $\frac{1}{k} < \epsilon$. Then choose $j$ so that if $x \in N_D(c, j)$ then $f(x) \in N(L, k)$. If $c = \infty$ this means that if $x > j$ and $x \in D$ then $|f(x) - L| < \frac{1}{k} < \epsilon$, so $\lim_{x \to \infty} f(x) = L$. If $c = -\infty$ this means that if $x < -j$ and $x \in D$ then $|f(x) - L| < \frac{1}{k} < \epsilon$, so $\lim_{x \to -\infty} f(x) = L$.

Case where $c = \infty$ and $L = \infty$ (or $-\infty$ respectively): In this case, $\lim_{x \to \infty} f(x) = L$ if, for every $M \in \mathbb{R}$, there is a $B$ so that if $x > B$ and $x \in D$ then $f(x) > M$ (respectively $f(x) < M$). In particular, for any $k \in \mathbb{N}$ we can find a $B$ so that if $x > B$ and $x \in D$ then $f(x) > k$ (respectively $f(x) < -k$). Choose $j \in \mathbb{N}$ so that $j > B$. Then if $x \in N_D(c, j)$ it follows that $f(x) \in N(L, k)$.

Conversely, assume that for every $k \in \mathbb{N}$ it is true that there is some $j \in \mathbb{N}$ so that if $x \in N_D(c, j)$ then $f(x) \in N(L, k)$. Let $M \in \mathbb{R}$ and choose $k \in \mathbb{N}$ so $k > M$ (respectively $-k < M$). Choose $j$ so that if $x \in N_D(c, j)$ then $f(x) \in N(L, k)$. If $L = \infty$ this means that if $x > j$ and $x \in D$ then $f(x) > k > M$, so $\lim_{x \to \infty} f(x) = \infty$. If $L = -\infty$ this means that if $x > j$ and $x \in D$ then $f(x) < -k < M$, so $\lim_{x \to \infty} f(x) = -\infty$.

Case where $c = -\infty$ and $L = \infty$ (or $-\infty$ respectively): In this case, $\lim_{x \to -\infty} f(x) = L$ if, for every $M \in \mathbb{R}$, there is a $B$ so that if $x < B$ and $x \in D$ then $f(x) > M$ (respectively $f(x) < M$). In particular, for any $k \in \mathbb{N}$ we can find a $B$ so that if $x < B$ and $x \in D$ then $f(x) > k$ (respectively $f(x) < -k$). Choose $j \in \mathbb{N}$ so that $-j < B$. Then if $x \in N_D(c, j)$ it follows that $f(x) \in N(L, k)$.

Conversely, assume that for every $k \in \mathbb{N}$ it is true that there is some $j \in \mathbb{N}$ so that if $x \in N_D(c, j)$ then $f(x) \in N(L, k)$. Let $M \in \mathbb{R}$ and choose $k \in \mathbb{N}$ so $k > M$ (respectively $-k < M$). Choose $j$ so that if $x \in N_D(c, j)$ then $f(x) \in N(L, k)$. If $L = \infty$ this means that if $x < -j$ and $x \in D$ then $f(x) > k > M$, so $\lim_{x \to -\infty} f(x) = \infty$. If $L = -\infty$ this means that if $x < -j$ and $x \in D$ then $f(x) < -k < M$, so $\lim_{x \to -\infty} f(x) = -\infty$.
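The neighborhood formulation can be tried numerically in the case where $c$ is real and $L = \infty$. The sketch below uses the hypothetical example $f(x) = 1/x^2$ with $c = 0$: for each $k$ it produces a $j$ with $\frac{1}{j} < \frac{1}{\sqrt{k}}$ and spot-checks a few points of $N(0, j)$. The function names and sample fractions are assumptions for illustration, and checking finitely many points is not a proof.

```python
import math

def f(x):
    return 1 / x**2

def witness_j(k):
    """A natural number j with j > sqrt(k), so 0 < |x| < 1/j gives f(x) > k."""
    return math.isqrt(k) + 1

def check(k, fractions=(0.999, 0.5, 0.01)):
    """Spot-check that sample points of N(0, j) are sent into N(infinity, k)."""
    j = witness_j(k)
    points = [sign * t / j for t in fractions for sign in (1, -1)]
    return all(f(x) > k for x in points)

assert all(check(k) for k in range(1, 200))
```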




Theorem 7.33. Sequential Characterization of Limits for Extended Real Numbers (SCLE). Let $c, L$ be extended real numbers so that $c$ is an extended real limit point of $D$. Let $f : D \to \mathbb{R}$. Let $t \in \mathbb{N}$. Then $\lim_{x \to c} f(x) = L$ if and only if for every sequence $\{x_n\} \subseteq N_D(c, t)$, if $\{x_n\} \to c$ then $\{f(x_n)\} \to L$.


Proof. First, assume that $\lim_{x \to c} f(x) = L$. Let $k \in \mathbb{N}$. Then there is a natural number $j$ so that if $x \in N_D(c, j)$ then $f(x) \in N(L, k)$ by Theorem 7.32. Let $\{x_n\} \subseteq N_D(c, t)$ with $\{x_n\} \to c$. Then for some $m \in \mathbb{N}$, if $n > m$ then $x_n \in N_D(c, j)$, which means that $f(x_n) \in N(L, k)$ and hence $\{f(x_n)\} \to L$.

Next, assume that for every sequence $\{x_n\} \subseteq N_D(c, t)$, if $\{x_n\} \to c$ then $\{f(x_n)\} \to L$. Suppose $\lim_{x \to c} f(x) \ne L$. Then, by Theorem 7.32, there is some $k \in \mathbb{N}$ so that for every $j \in \mathbb{N}$ it is true that there is some $x \in N_D(c, j)$ so that $f(x) \notin N(L, k)$. For each $n \in \mathbb{N}$ we choose $x_n \in N_D(c, n + t) \subseteq N_D(c, t)$ so that $f(x_n) \notin N(L, k)$. Then $\{x_n\} \to c$, but $\{f(x_n)\} \not\to L$, which is a contradiction. We conclude that $\lim_{x \to c} f(x) = L$.


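Theorem 7.34. Let $f, g : D \to \mathbb{R}$. Let $c$ be an extended real number. If $\lim_{x \to c} f(x) = s \in \mathbb{R}$ and $\lim_{x \to c} g(x) = t \in \mathbb{R}$ then for any $\alpha, \beta \in \mathbb{R}$ it is true that $\lim_{x \to c} \alpha f(x) + \beta g(x) = \alpha s + \beta t$, $\lim_{x \to c} f(x)g(x) = st$, and if $t \ne 0$ and $N_{\mathrm{dom}(f/g)}(c, i)$ is infinite for all $i \in \mathbb{N}$ then $\lim_{x \to c} \frac{f(x)}{g(x)} = \frac{s}{t}$.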


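Proof. Let $\{x_n\}$ be a sequence of points in $D$ so that $\{x_n\} \to c$. Then by the SCLE, we know that $\{f(x_n)\} \to s$ and $\{g(x_n)\} \to t$. Thus, $\{\alpha f(x_n) + \beta g(x_n)\} \to \alpha s + \beta t$, $\{f(x_n)g(x_n)\} \to st$ and $\left\{\frac{f(x_n)}{g(x_n)}\right\} \to \frac{s}{t}$ if $t \ne 0$ and $\{x_n\} \subseteq \mathrm{dom}\left(\frac{f}{g}\right)$. Hence, by the SCLE, we know that $\lim_{x \to c} \alpha f(x) + \beta g(x) = \alpha s + \beta t$, $\lim_{x \to c} f(x)g(x) = st$, and $\lim_{x \to c} \frac{f(x)}{g(x)} = \frac{s}{t}$ if $t \ne 0$.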


Theorem 7.35. Let $f, g : D \to \mathbb{R}$, and let $\lim_{x \to c} f(x) = \infty$ (respectively, $-\infty$), where $c$ is an extended real number. Let $g(N_D(c, t))$ be bounded below (respectively, bounded above) for some natural number $t$. Then $\lim_{x \to c} f(x) + g(x) = \infty$ (respectively, $-\infty$).


Proof. Let $\{x_n\} \subseteq D$ with $\{x_n\} \to c$. Then by SCLE, $\{f(x_n)\} \to \infty$ (respectively $-\infty$). Choose $B$ so that $g(N_D(c, t))$ is bounded below by $B$ (respectively above by $B$). Pick $s \in \mathbb{N}$ so that if $n > s$ then $x_n \in N_D(c, t)$, so $g(x_n) \ge B$ (respectively $g(x_n) \le B$). By Theorem 3.28, we know that $\{f(x_n) + g(x_n)\} \to \infty$ (respectively, $-\infty$). Thus, by SCLE we know that $\lim_{x \to c} f(x) + g(x) = \infty$ (respectively, $-\infty$).




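Theorem 7.36. Let $f, g : D \to \mathbb{R}$. Let $c$ be an extended real number and let $\lim_{x \to c} f(x) = \infty$ (respectively $-\infty$). If $\lim_{x \to c} g(x) = L \in (0, \infty)$ or $\lim_{x \to c} g(x) = \infty$ then $\lim_{x \to c} f(x)g(x) = \infty$ (respectively, $-\infty$) and $\lim_{x \to c} -g(x)f(x) = -\infty$ (respectively, $\infty$).
In particular, $\lim_{x \to c} f(x) = \infty$ if and only if $\lim_{x \to c} -f(x) = -\infty$.


Proof. Let $M > 0$. If $\lim_{x \to c} g(x) = \infty$ then, for the purposes of this argument, let $L$ denote any fixed positive real number. Choose $k \in \mathbb{N}$ so that $k > \frac{2M}{L}$. If $\lim_{x \to c} g(x) = L \in (0, \infty)$ then also choose $k$ so $\frac{1}{k} < \frac{L}{2}$. If $\lim_{x \to c} g(x) = \infty$ then choose $k$ so that $k > \frac{L}{2}$. Then we can find $j \in \mathbb{N}$ so that if $x \in N_D(c, j)$ then $f(x) \in N(\infty, k)$ (respectively, $f(x) \in N(-\infty, k)$) and $g(x) \in N(\infty, k)$ if $\lim_{x \to c} g(x) = \infty$ and $g(x) \in N(L, k)$ if $\lim_{x \to c} g(x) = L$. Hence, $f(x) > \frac{2M}{L}$ (respectively $f(x) < -\frac{2M}{L}$) and $g(x) > \frac{L}{2}$, so $f(x)g(x) > M$ and $-f(x)g(x) < -M$ (respectively $f(x)g(x) < -M$ and $-f(x)g(x) > M$), which means $\lim_{x \to c} f(x)g(x) = \infty$ and $\lim_{x \to c} -f(x)g(x) = -\infty$ (respectively $\lim_{x \to c} f(x)g(x) = -\infty$ and $\lim_{x \to c} -f(x)g(x) = \infty$).

By the preceding argument, setting $g(x) = 1$, we see that if $\lim_{x \to c} f(x) = \infty$ then $\lim_{x \to c} -f(x) = -\infty$, and if $\lim_{x \to c} f(x) = -\infty$ then $\lim_{x \to c} -f(x) = \infty$.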


Theorem 7.37. Let $c$ be an extended real number and let $f, g : D \to \mathbb{R}$, where $f$ is bounded on some $N_D(c, t)$ so that for some $M > 0$ it is true that $|f(x)| < M$ if $x \in N_D(c, t)$, and let $\lim_{x \to c} g(x) = \pm\infty$. Then $\lim_{x \to c} \frac{f(x)}{g(x)} = 0$.


Proof. Let $\{x_n\} \subseteq D$ and let $\{x_n\} \to c$. By SCLE we know that $\{g(x_n)\} \to \pm\infty$, and since for sufficiently large $n$ we know that $x_n \in N_D(c, t)$ we know that $\{f(x_n)\}$ is bounded by Theorem 3.27, which means that $\left\{\frac{f(x_n)}{g(x_n)}\right\} \to 0$ by Theorem 3.29. Thus, by SCLE it follows that $\lim_{x \to c} \frac{f(x)}{g(x)} = 0$.


Theorem 7.38. Squeeze Theorem for Extended Real Numbers. Let $f, g, h : D \to \mathbb{R}$. Let $\lim_{x \to c} f(x) = L$, and let $\lim_{x \to c} h(x) = L$, where $c$ and $L$ are extended real numbers. If, for some natural number $t$ it is true that $f(x) \le g(x) \le h(x)$ for all $x \in N_D(c, t)$ then $\lim_{x \to c} g(x) = L$.


Proof. Let $k \in \mathbb{N}$. We can find some $j > t$ so that if $x \in N_D(c, j)$ then $f(x), h(x) \in N(L, k)$. Since $f(x) \le g(x) \le h(x)$ (and each $N(L, k)$ is an interval) we know that $g(x) \in N(L, k)$, which means that $\lim_{x \to c} g(x) = L$ by Theorem 7.32.




Definition 50 


We use the convention that if $a > 0$ and $M = \infty$ and $N = -\infty$ then $aM = \infty$ and $aN = -\infty$. If $b < 0$ then we say $bM = -\infty$ and $bN = \infty$. If $c \in \mathbb{R}$ then $c + M = \infty$ and $c + N = -\infty$. We also define $MM = \infty$, $MN = -\infty$, $NN = \infty$.


Recall that $\infty, -\infty$ are not real numbers, and we don't have a multiplication operation involving them as part of a field containing the real numbers. The above notation is simply notation to make proofs and statements briefer. So, the symbols $(2)(\infty)$ in a theorem statement would simply be read as $\infty$.
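As an aside, the same conventions happen to be the ones that IEEE 754 floating point arithmetic uses for its infinite values, which gives a quick way to recall them. The sketch below uses Python's `math.inf`; the names `M` and `N` mirror the definition, and the comparison with floating point is an illustration rather than anything the text relies on.

```python
import math

M, N = math.inf, -math.inf

# aM and aN for a > 0, then bM and bN for b < 0.
assert 2 * M == math.inf and 2 * N == -math.inf
assert -3 * M == -math.inf and -3 * N == math.inf
# c + M and c + N for a real number c.
assert 5 + M == math.inf and 5 + N == -math.inf
# Products of the infinite symbols.
assert M * M == math.inf and M * N == -math.inf and N * N == math.inf
# Combinations the definition leaves undefined give "not a number".
assert math.isnan(M + N) and math.isnan(0 * M)
```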


Theorem 7.39. Let $f, g : D \to \mathbb{R}$, where $\lim_{x \to a} f(x) = M$ and $\lim_{x \to a} f(x)g(x) = ML$, where $M$ is a non-zero real number and $a$ is an extended real number and $L$ is an extended real number. Then $\lim_{x \to a} g(x) = L$. Furthermore, if $\lim_{x \to a} f(x) + g(x) = M + L$, where $M \in \mathbb{R}$, then $\lim_{x \to a} g(x) = L$.


Proof. Let $\{x_n\} \to a$. By SCLE we know that $\{f(x_n)\} \to M \in \mathbb{R}$ and $\{f(x_n) + g(x_n)\} \to M + L$ and $\{f(x_n)g(x_n)\} \to ML$.

If $M + L \in \mathbb{R}$ then we know that $L \in \mathbb{R}$ and that $\{g(x_n)\} \to L$ by Exercise 3.8. Likewise, if $\{f(x_n)g(x_n)\} \to ML \in \mathbb{R}$ and $M \ne 0$ then $\{g(x_n)\} \to L$ by Exercise 3.9. Thus, by SCLE we know that the result is true in all cases where $M + L \in \mathbb{R}$ or $ML \in \mathbb{R}$.

Let $M + L = \infty$ (respectively $-\infty$) where $M \in \mathbb{R}$. This is true if and only if $L = \infty$ (respectively $-\infty$). Since $\{-f(x_n)\}$ converges, we know it is bounded. Thus, by Theorem 7.35 we know that $\{f(x_n) + g(x_n) - f(x_n)\} = \{g(x_n)\} \to L$, so $\lim_{x \to a} g(x) = L$ by SCLE.

Let $ML = \infty$ and $M > 0$. Then $L = \infty$. Let $k \in \mathbb{N}$. Since $M > 0$ we can find $j \in \mathbb{N}$ so that if $n > j$ then $\frac{M}{2} < f(x_n) < \frac{3M}{2}$ and $f(x_n)g(x_n) > \frac{3Mk}{2}$, so $\frac{1}{f(x_n)} f(x_n)g(x_n) = g(x_n) > \frac{2}{3M} \cdot \frac{3Mk}{2} = k$, which means that $g(x_n) \in N(\infty, k)$, so $\lim_{x \to a} g(x) = \infty$.

Next, let $ML = -\infty$ and $M > 0$. Then $L = -\infty$. By Theorem 7.36, we know that $\lim_{x \to a} f(x)g(x) = -\infty$ if and only if $\lim_{x \to a} f(x)(-g(x)) = \infty$, which by the preceding argument is true if and only if $\lim_{x \to a} -g(x) = \infty$, which is true if and only if $\lim_{x \to a} g(x) = -\infty$.

Similarly, if we let $ML = \infty$ and $M < 0$ then we know that $L = -\infty$ and $\lim_{x \to a} -f(x) = -M > 0$, so we know that $\lim_{x \to a} (-f(x))g(x) = -\infty$, which means $\lim_{x \to a} g(x) = -\infty$ by the preceding argument.

Finally, if we let $ML = -\infty$ and $M < 0$ then we know that $L = \infty$ and $\lim_{x \to a} -f(x) = -M > 0$, so we know that $\lim_{x \to a} (-f(x))g(x) = \infty$, which means $\lim_{x \to a} g(x) = \infty$.

Thus, in all possible cases the theorem result is true.


Theorem 7.40. Let $c$ be an extended limit point of $\mathrm{dom}(f \circ g)$. Let $\lim_{x \to c} g(x) = L$, where $c$ and $L$ are extended real numbers. If $L$ is real and $f(x)$ is continuous at $L$ then $\lim_{x \to c} f(g(x)) = f(L) = M$, a real number. If $L = \pm\infty$ and $\lim_{x \to L} f(x) = M$, an extended real number, then $\lim_{x \to c} f(g(x)) = M$.
If $L$ is an extended limit point of the domain of $f$ then $\lim_{x \to c} f(g(x)) = M = \lim_{y \to L} f(y)$.


Proof. This theorem is already proven for the case when $c$ and $L$ are real in Theorem 4.11.

Let $\{x_n\} \subseteq \mathrm{dom}(f \circ g) \setminus \{c\}$ so that $\{x_n\} \to c$. Then by SCLE we know that $\{g(x_n)\} \to L$. If $f$ is continuous at $L \in \mathbb{R}$ we know that $\{f(g(x_n))\} \to M$ by the Sequential Characterization of Continuity. If $L = \pm\infty$ and $\lim_{x \to L} f(x) = M$ then $\{f(g(x_n))\} \to M$ by SCLE. Thus, by SCLE we know that $\lim_{x \to c} f(g(x)) = M$.

If $L$ is an extended limit point of the domain of $f(x)$ then by Theorem 4.4, $\lim_{y \to L} f(y) = M = \lim_{x \to c} f(g(x))$.


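Theorem 7.41. Let $\{a_n\}$ be a non-zero sequence so that $a_n > 0$ for sufficiently large $n$ and $\{a_n\} \to 0$. Let $\{b_n\}$ be a non-zero sequence so that $b_n < 0$ for sufficiently large $n$ and $\{b_n\} \to 0$. Then $\left\{\frac{1}{a_n}\right\} \to \infty$ and $\left\{\frac{1}{b_n}\right\} \to -\infty$.


Proof. Let $M > 0$. Then we can find $k \in \mathbb{N}$ so that if $n > k$ then $-\frac{1}{M} < b_n < 0 < a_n < \frac{1}{M}$, which means that $\frac{1}{b_n} < -M$ and $\frac{1}{a_n} > M$ and hence $\left\{\frac{1}{a_n}\right\} \to \infty$ and $\left\{\frac{1}{b_n}\right\} \to -\infty$.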


Theorem 7.42. Let $\lim_{x \to c} f(x) = 0$ for a non-zero function $f$ with domain $D$. Then $\lim_{x \to c} \frac{1}{f(x)} = \infty$ if $f$ is positive in some $N_D(c, t)$ for some $t \in \mathbb{N}$, and $\lim_{x \to c} \frac{1}{f(x)} = -\infty$ if $f$ is negative in some $N_D(c, t)$ for some $t \in \mathbb{N}$.


Proof. Let $\{x_n\} \subseteq D$ so that $\{x_n\} \to c$. Then we know that $\{f(x_n)\} \to 0$ by the Sequential Characterization of Limits for Extended Real Numbers. If $f$ is positive in some $N_D(c, t)$ then $f(x_n)$ is non-zero for all $n$ and is positive for sufficiently large $n$, so by Theorem 7.41 we know that $\left\{\frac{1}{f(x_n)}\right\} \to \infty$, which means $\lim_{x \to c} \frac{1}{f(x)} = \infty$ by the Sequential Characterization of Limits for Extended Real Numbers. Likewise, if $f$ is negative in some $N_D(c, t)$ then $f(x_n)$ is non-zero for all $n$ and is negative for sufficiently large $n$, so by Theorem 7.41 we know that $\left\{\frac{1}{f(x_n)}\right\} \to -\infty$, which means $\lim_{x \to c} \frac{1}{f(x)} = -\infty$.


7.6 Topology of the Real Line


We have already discussed open and closed sets and limit points. This section discusses notions of compactness and connectedness, giving ways to formulate alternate proofs of the Extreme Value Theorem and the proof that a continuous function on a compact domain is uniformly continuous.
Definition 51 


The set of open sets in a space (typically the real numbers or a subset of the real numbers for this text) is called the topology of the space. We say that a set $\mathcal{C}$ of sets is a cover of a set $E$ if $E \subseteq \bigcup_{U \in \mathcal{C}} U$. We say that $\mathcal{C}$ is an open cover of $E$ if each element of $\mathcal{C}$ is an open set. We say that a set $K$ is compact if for every open cover $\mathcal{C}$ of $K$ there is a finite subset $F$ of $\mathcal{C}$ which is also a cover of $K$. We refer to $F$ as a finite subcover (and we may say finite subcover of $K$ or $\mathcal{C}$, both meaning a finite subset of $\mathcal{C}$ that is a cover for $K$).


Theorem 7.43. Every closed interval $[a, b]$ is compact.


Proof. Let $\mathcal{U} = \{U_\alpha\}_{\alpha \in J}$ be an open cover of $[a, b]$ (where $J$ is an arbitrary indexing set). Let $S = \{x \in [a, b] \mid [a, x] \text{ is covered by a finite subset of } \mathcal{U}\}$. Then $a \in S$ since $a$ is in a member of $\mathcal{U}$, and $S$ is bounded above, so $S$ has a least upper bound $l \in [a, b]$. Hence, for some $\beta \in J$, we know that $l \in U_\beta$, and thus for some $\epsilon > 0$ it follows that $(l - \epsilon, l + \epsilon) \subseteq U_\beta$ since $U_\beta$ is open. By the Approximation Property, we can find some $s \in S$ so that $s > l - \epsilon$, and a finite set $F \subseteq \mathcal{U}$ so that $F$ covers $[a, s]$ since $s \in S$. Thus, $F \cup \{U_\beta\}$ is a finite subset of $\mathcal{U}$ covering $[a, l + \epsilon)$. Since $l$ is an upper bound for $S$, we know that no point of $(l, l + \epsilon)$ is in $S$, which implies that $l = b$ and hence $F \cup \{U_\beta\}$ is a finite subset of $\mathcal{U}$ covering $[a, b]$. Thus, $[a, b]$ is compact.
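The idea of pushing the covered portion of $[a, b]$ to the right, which drives the least upper bound argument, can be imitated by a greedy sweep. The sketch below is only an illustration under a strong assumption: the cover is given as a finite list of open intervals `(lo, hi)` (a program can only inspect finitely many sets), so it shows how few intervals may be needed rather than proving compactness. The function name and sample data are hypothetical.

```python
def finite_subcover(intervals, a, b):
    """Greedily pick open intervals (lo, hi) from `intervals` covering [a, b].

    Returns the chosen intervals, or None if the family does not cover [a, b].
    """
    chosen = []
    reach = a  # leftmost point of [a, b] not yet known to be covered
    while True:
        containing = [(lo, hi) for lo, hi in intervals if lo < reach < hi]
        if not containing:
            return None          # the point `reach` is not covered
        best = max(containing, key=lambda iv: iv[1])
        chosen.append(best)
        if best[1] > b:
            return chosen        # everything up to and including b is covered
        reach = best[1]          # the right end point itself still needs covering

cover = [(-1, 0.3), (0.2, 0.6), (0.1, 0.4), (0.5, 1.2), (0.25, 0.55)]
print(finite_subcover(cover, 0, 1))
```

Each pass strictly increases `reach`, so the loop stops after at most as many passes as there are intervals in the list.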


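Theorem 7.44. Heine-Borel Theorem. Let $K \subseteq \mathbb{R}$. Then $K$ is compact if and only if $K$ is closed and bounded.


Proof. First, assume $K$ is compact. Let $U_n = (-n, n)$ for each $n \in \mathbb{N}$. Then $\{U_n\}_{n \in \mathbb{N}}$ is an open cover for $K$ which has a finite subcover $F = \{U_{n_1}, U_{n_2}, \ldots, U_{n_m}\}$, where $n_1 < n_2 < \ldots < n_m$. Thus, $K \subseteq \bigcup_{i=1}^{m} U_{n_i} = (-n_m, n_m)$, so $K$ is bounded.

Suppose $K$ is not closed. Then there is a point $p \in \overline{K} \setminus K$. Let $V_n = \mathbb{R} \setminus \left[p - \frac{1}{n}, p + \frac{1}{n}\right]$ for each $n \in \mathbb{N}$. Then $\{V_n\}_{n \in \mathbb{N}}$ is an open cover for $K$ since $p \notin K$, and this cover has a finite subcover $G = \{V_{n_1}, V_{n_2}, \ldots, V_{n_j}\}$, where $n_1 < n_2 < \ldots < n_j$. But $\bigcup_{i=1}^{j} V_{n_i} = \mathbb{R} \setminus \left[p - \frac{1}{n_j}, p + \frac{1}{n_j}\right]$. This is impossible because $\left(p - \frac{1}{n_j}, p + \frac{1}{n_j}\right) \cap K \ne \emptyset$ since $p$ is a limit point of $K$. We conclude that $K$ must be closed.

Finally, assume that $K$ is closed and bounded. Then for some $a < b$ we know $K \subseteq [a, b]$. Let $\mathcal{C} = \{C_\alpha\}_{\alpha \in J}$ be an open cover of $K$. Since $K$ is closed, $\mathbb{R} \setminus K$ is open, so $\mathcal{C} \cup \{\mathbb{R} \setminus K\}$ is an open cover of $[a, b]$, and by Theorem 7.43 that cover has a finite subcover $F$. Since no point of $K$ is contained in $\mathbb{R} \setminus K$, it follows that $F \setminus \{\mathbb{R} \setminus K\}$ is a finite subset of $\mathcal{C}$ which covers $K$ and therefore $K$ is compact.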


Theorem 7.45. Let $f : E \to \mathbb{R}$. Then:

(a) The function $f$ is continuous if and only if, for every open subset $V$ of $\mathbb{R}$, the set $f^{-1}(V)$ is open in $E$.

(b) The function $f$ is continuous at a point $p \in E$ if and only if for every open set $V$ containing $f(p)$, there is an open set $U$ containing $p$ such that $U \cap E \subseteq f^{-1}(V)$.


Proof. (a) First, assume that $f$ is continuous. Let $V$ be open in $\mathbb{R}$ and let $x \in f^{-1}(V)$. Since $V$ is open there is an $\epsilon > 0$ so that $(f(x) - \epsilon, f(x) + \epsilon) \subseteq V$. Since $f$ is continuous there is a $\delta > 0$ so that if $|x - y| < \delta$ then $|f(x) - f(y)| < \epsilon$ for all $y \in E$. Thus $(x - \delta, x + \delta) \cap E \subseteq f^{-1}(V)$, so $f^{-1}(V)$ is open in $E$.

Next, assume that the inverse image of every open set is open in $E$. Let $p \in E$ and let $\epsilon > 0$. Note that $(f(p) - \epsilon, f(p) + \epsilon)$ is open, so $f^{-1}((f(p) - \epsilon, f(p) + \epsilon))$ is open in $E$, which means that there is a $\delta > 0$ so that $(p - \delta, p + \delta) \cap E \subseteq f^{-1}((f(p) - \epsilon, f(p) + \epsilon))$. Hence, if $|x - p| < \delta$ and $x \in E$ then $|f(x) - f(p)| < \epsilon$, which means that $f$ is continuous.

(b) First, assume that $f$ is continuous at $p \in E$. Let $V$ be open in $\mathbb{R}$ containing $f(p)$. Since $V$ is open there is an $\epsilon > 0$ so that $(f(p) - \epsilon, f(p) + \epsilon) \subseteq V$. Since $f$ is continuous at $p$ there is a $\delta > 0$ so that if $|x - p| < \delta$ then $|f(x) - f(p)| < \epsilon$ for all $x \in E$. Thus $(p - \delta, p + \delta) \cap E \subseteq f^{-1}(V)$.

Next, assume that for every open set $V$ containing $f(p)$, there is an open set $U$ containing $p$ such that $U \cap E \subseteq f^{-1}(V)$. Let $\epsilon > 0$. Note that $(f(p) - \epsilon, f(p) + \epsilon)$ is open. Thus, there is an open set $U$ containing $p$ so that $U \cap E \subseteq f^{-1}((f(p) - \epsilon, f(p) + \epsilon))$, which means that there is a $\delta > 0$ so that $(p - \delta, p + \delta) \cap E \subseteq U \cap E \subseteq f^{-1}((f(p) - \epsilon, f(p) + \epsilon))$. Hence, if $|x - p| < \delta$ and $x \in E$ then $|f(x) - f(p)| < \epsilon$, which means that $f$ is continuous at $p$.


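Theorem 7.46. Let $f : K \to \mathbb{R}$ be continuous, where $K$ is compact. Then $f(K)$ is compact.


Proof. Let $\mathcal{C} = \{U_\alpha\}_{\alpha \in J}$ be an open cover of $f(K)$. For each $\alpha \in J$, since $f^{-1}(U_\alpha)$ is open in $K$ we may choose $V_\alpha$ open in $\mathbb{R}$ so that $V_\alpha \cap K = f^{-1}(U_\alpha)$. Then the collection of sets $\{V_\alpha\}_{\alpha \in J}$ covers $K$ and has a finite subcover $V_{\alpha_1}, V_{\alpha_2}, \ldots, V_{\alpha_j}$. The corresponding open sets $U_{\alpha_1}, U_{\alpha_2}, \ldots, U_{\alpha_j}$ are a finite subset of $\mathcal{C}$ which covers $f(K)$. Thus $f(K)$ is compact.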


Definition 52 


We say that a continuous one to one and onto function $f : E \to K$ is a homeomorphism if $f^{-1} : K \to E$ is also continuous, in which case we say that $E$ and $K$ are homeomorphic (topologically indistinguishable, essentially).




Theorem 7.47. Let $f : K \to \mathbb{R}$ be continuous and one to one, where $K$ is compact. Then $f^{-1} : f(K) \to \mathbb{R}$ is also continuous. In other words, $f$ is a homeomorphism from $K$ onto $f(K)$.


Proof. Let $A$ be a closed set. Then $A \cap K$ is closed since $A$ and $K$ are closed, and is bounded since $K$ is bounded, so $A \cap K$ is compact by the Heine-Borel Theorem. By Theorem 7.46, we know that $f(A \cap K)$ is compact and therefore closed. Since the inverse image of $A$ under $f^{-1}$ is $f(A \cap K)$ it follows that $f^{-1}$ is continuous on $f(K)$ by Exercise 7.7.


Theorem 7.48. The Extreme Value Theorem. Let $f : K \to \mathbb{R}$ be continuous, where $K$ is closed and bounded. Then there are points $s, t \in K$ so $f(s) \le f(x) \le f(t)$ for every $x \in K$.


Proof. By the Heine-Borel Theorem, $K$ is compact, so we know that $f(K)$ is compact by Theorem 7.46. By Theorem 3.22 this means that $f(K)$ has a largest value $f(t)$ and a smallest value $f(s)$ for some $s, t \in K$.


Definition 53 


Let $L$ be the set of limit points of a set $E$. The closure of a set $E$, denoted $\overline{E}$, is $E \cup L$. A pair of non-empty subsets $A$ and $B$ of a set $E$ is a separation of $E$ if $A \cup B = E$ and the sets satisfy $\overline{A} \cap B = \emptyset = A \cap \overline{B}$. A set $E$ is connected if it has no separation. If $I$ is a bounded interval then we use the notation $|I|$ to refer to the length of $I$ (the right end point of $I$ minus the left end point of $I$). We say a set $D \subseteq S$ is dense in $S$ if $\overline{D} \supseteq S$. We say that $S$ is separable if $S$ has a countable subset which is dense in $S$.


Theorem 7.49. Let $f : E \to \mathbb{R}$ be continuous and let $E$ be connected. Then $f(E)$ is connected.


Proof. Suppose $f(E)$ has a separation $A, B$. Then since $A, B$ are disjoint, $f^{-1}(A)$ and $f^{-1}(B)$ are disjoint. Since $\overline{A} \cap B = \emptyset$ and $\overline{A}$ is closed by Exercise 4.5, we know that $f^{-1}(B) = f^{-1}(\mathbb{R} \setminus \overline{A})$ is open in $E$. Likewise, $f^{-1}(A)$ is open in $E$. Since $A, B$ are non-empty, $f^{-1}(A), f^{-1}(B)$ are non-empty, and since $A \cup B = f(E)$ we know that $f^{-1}(A) \cup f^{-1}(B) = E$. Furthermore, if $p \in f^{-1}(B)$ and a sequence $\{x_n\} \subseteq f^{-1}(A)$ and $\{x_n\} \to p$, it follows that $\{f(x_n)\} \to f(p)$, which means that $f(p)$ is a limit point of $A$ which is contained in $B$, which is impossible. Hence, $f^{-1}(B)$ contains no limit points of $f^{-1}(A)$ and likewise $f^{-1}(A)$ contains no limit points of $f^{-1}(B)$. Thus, the pair $f^{-1}(A), f^{-1}(B)$ is a separation of $E$, which is impossible since $E$ is connected.


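Theorem 7.50. Let $A, B$ be non-empty disjoint subsets of a set $E$ so that $A \cup B = E$. Then the pair of sets $A, B$ is a separation of $E$ if and only if there are disjoint open sets $U, V$ so that $U \cap E = A$ and $V \cap E = B$, which is true if and only if $A$ is both closed and open in $E$.


Proof. First, assume that $A, B$ form a separation of $E$. Then no point of $A$ is a limit point of $B$ and no point of $B$ is a limit point of $A$, which means that for each $x \in A$ there is some $\epsilon_x > 0$ so that $(x - \epsilon_x, x + \epsilon_x) \cap B = \emptyset$ and for each point $y \in B$ there is some $\epsilon_y > 0$ so that $(y - \epsilon_y, y + \epsilon_y) \cap A = \emptyset$. Let $U = \bigcup_{x \in A} \left(x - \frac{\epsilon_x}{2}, x + \frac{\epsilon_x}{2}\right)$ and let $V = \bigcup_{y \in B} \left(y - \frac{\epsilon_y}{2}, y + \frac{\epsilon_y}{2}\right)$. Then $A \subseteq U$ and $B \subseteq V$. Suppose that $z \in U \cap V$. Then for some $x_0 \in A$ and $y_0 \in B$, $|z - x_0| < \frac{\epsilon_{x_0}}{2}$ and $|z - y_0| < \frac{\epsilon_{y_0}}{2}$. Assume that $\epsilon_{x_0} \ge \epsilon_{y_0}$. Then $|x_0 - y_0| \le |x_0 - z| + |z - y_0| < \frac{\epsilon_{x_0}}{2} + \frac{\epsilon_{y_0}}{2} \le \frac{2\epsilon_{x_0}}{2} = \epsilon_{x_0}$, which is impossible since $(x_0 - \epsilon_{x_0}, x_0 + \epsilon_{x_0}) \cap B = \emptyset$. If $\epsilon_{x_0} < \epsilon_{y_0}$ we arrive at a similar contradiction. We conclude that $U \cap V = \emptyset$. Since $A \subseteq U$, $B \subseteq V$ and $A \cup B = E$, it follows that $U \cap E = A$ and $V \cap E = B$.

Assume that there are disjoint open sets $U, V$ so that $U \cap E = A$ and $V \cap E = B$. Then $A$ and $B$ are both open in $E$. Since $B = E \setminus A = (\mathbb{R} \setminus U) \cap E$ and $A = E \setminus B = (\mathbb{R} \setminus V) \cap E$ we know that $A, B$ are both closed in $E$.

Assume that $A$ is open and closed in $E$. Then since $A$ is closed in $E$, there is a closed set $K$ so that $K \cap E = A$, which means that $(\mathbb{R} \setminus K) \cap E = B$ is open in $E$. Since $A$ is open in $E$ there is an open set $U$ so that $U \cap E = A$, which means that each point of $A$ is contained in an open interval that does not intersect $B$, so no point of $A$ is a limit point of $B$ and $A \cap \overline{B} = \emptyset$. Similarly, $B \cap \overline{A} = \emptyset$ since $B$ is open in $E$. Hence, $A$ and $B$ form a separation of $E$.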


Theorem 7.51. Let $E \subseteq \mathbb{R}$. Then $E$ is connected if and only if $E$ is an interval.


Proof. First, assume $E$ is not an interval. Then for some $a, b \in E$ it follows that $a < c < b$ and $c \notin E$ for some point $c$. But then $(-\infty, c) \cap E, (c, \infty) \cap E$ are a separation of $E$, so $E$ is not connected.

Next, assume that $E$ is an interval and suppose that $E$ is not connected. Then $E$ has a separation $A, B$. Define $f : E \to \mathbb{R}$ by $f(x) = 0$ if $x \in A$ and $f(x) = 2$ if $x \in B$. Since the inverse image of every set under $f$ is one of the empty set, $A$, $B$ or all of $E$, all of which are open in $E$, we know that $f$ is continuous. Choose $a \in A$ and $b \in B$. Since $E$ is an interval, every point between $a$ and $b$ is in $E$, so by the Intermediate Value Theorem it follows that $f(c) = 1$ for some $c$ between $a$ and $b$, which is impossible. We conclude that $E$ is connected.


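Theorem 7.52. Lebesgue Number Lemma. Let $K$ be a compact set and $\mathcal{C} = \{C_\alpha\}_{\alpha \in J}$ be an open cover of $K$. Then there is a number $\delta > 0$ so that if $I$ is an open interval such that $I \cap K \ne \emptyset$ and $|x - y| < \delta$ for all $x, y \in I$ then $I \subseteq C_\beta$ for some $\beta \in J$.


Proof. Since $\mathcal{C}$ is an open cover of $K$, for each $p \in K$ we may choose $\epsilon_p > 0$ so that $(p - 2\epsilon_p, p + 2\epsilon_p) \subseteq C_{\alpha_p}$ for some $\alpha_p \in J$. Then $\{(p - \epsilon_p, p + \epsilon_p)\}_{p \in K}$ is an open cover of $K$ with finite subcover $F = \{(p_i - \epsilon_{p_i}, p_i + \epsilon_{p_i})\}_{1 \le i \le m}$. Let $\delta = \min\{\epsilon_{p_i}\}_{1 \le i \le m}$. Let $I$ be an interval with $|I| < \delta$ and let $x \in I \cap K$. Then for some $j$ we know that $x \in (p_j - \epsilon_{p_j}, p_j + \epsilon_{p_j})$, so if $y \in I$ then $|p_j - y| \le |p_j - x| + |x - y| < \epsilon_{p_j} + \delta \le 2\epsilon_{p_j}$. Thus, $I \subseteq (p_j - 2\epsilon_{p_j}, p_j + 2\epsilon_{p_j}) \subseteq C_{\alpha_{p_j}}$.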




Note that, in particular, if $x, y$ are points of $K$ with $|x - y| < \delta$ then, since $[x, y]$ is a subset of an element of the cover $\mathcal{C}$, it follows that both $x$ and $y$ are in one element of $\mathcal{C}$.


Theorem 7.53. Let $f : K \to \mathbb{R}$ be continuous, where $K$ is closed and bounded. Then $f$ is uniformly continuous.


Proof. Let $\epsilon > 0$. For each $x \in K$ choose $\delta_x > 0$ so that if $y \in K$ and $|x - y| < \delta_x$ then $|f(x) - f(y)| < \frac{\epsilon}{2}$. Note that $\mathcal{C} = \{(x - \delta_x, x + \delta_x) \mid x \in K\}$ is an open cover of $K$. Since $K$ is compact (by the Heine-Borel Theorem) we know by the Lebesgue Number Lemma that there is a number $\delta > 0$ so that if $x, y \in K$ and $|x - y| < \delta$ then for some $z \in K$ it follows that $x, y \in (z - \delta_z, z + \delta_z)$, so

$|f(x) - f(y)| \le |f(x) - f(z)| + |f(z) - f(y)| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$.


7.7 L’Hospital’s Rule 


We will be using the notation developed in the section "More on Infinite and Sequence Limits" in the following argument.


Theorem 7.54. L'Hospital's Rule. Let $f, g : I \to \mathbb{R}$ be differentiable on an interval $I$ and let $g'(x) \ne 0$ for all $x \in I \setminus \{a\}$ where $a$ is either an element of the open interval $I$ or an end point of $I$ or $a = \infty$ and $I$ is not bounded above or $a = -\infty$ and $I$ is not bounded below. Let $\lim_{x \to a} f(x) = 0 = \lim_{x \to a} g(x)$ and $\lim_{x \to a} \frac{f'(x)}{g'(x)} = L$, or let $\lim_{x \to a} f(x) = \pm\infty$ and let $\lim_{x \to a} g(x) = \pm\infty$ and $\lim_{x \to a} \frac{f'(x)}{g'(x)} = L$, where $L$ is an extended real number. Then $\lim_{x \to a} \frac{f(x)}{g(x)} = L$.


Proof. First, we observe that we proved this theorem for the cases where EL € R and 
lim f(x) = 0 = lim g(a) in chapter 5. 
«wa wa 


/ 
Next, assume that lim f(x) = too and let lim g(x) = +00 and lim f (2) = L, where a 
@—a @—a ra g (x) 
and L are extended real numbers. 


Choose $k_0 \in \mathbb{N}$ so that if $x \in N_I(a, k_0)$ then $|g(x)| > 0$. Choose a point $s_1 \in N_I(a, k_0)$.

Since $\lim_{x \to a} g(x) = \pm\infty$ we can find $k_1 > k_0$ so that if $t \in N_I(a, k_1)$ then $\frac{|g(s_1)|}{|g(t)|} < 1$ and $\frac{|f(s_1)|}{|g(t)|} < 1$. Let $\{t_n\} \subset N_I(a, k_1)$, where $\{t_n\} \to a$. We then find $k_2 > k_1$ so that if $t \in N_I(a, k_2)$ then $\frac{|g(t_1)|}{|g(t)|} < \frac{1}{2}$ and $\frac{|f(t_1)|}{|g(t)|} < \frac{1}{2}$. For some integer $m_1$, it is true that if $n > m_1$, then $t_n \in N_I(a, k_2)$. We set $s_i = s_1$ if $1 \le i \le m_1$. Note that $\frac{|g(s_i)|}{|g(t_i)|} < 1$ and $\frac{|f(s_i)|}{|g(t_i)|} < 1$ if $1 \le i \le m_1$.

We then find $k_3 > k_2$ so that if $t \in N_I(a, k_3)$ then $\frac{|g(t_2)|}{|g(t)|} < \frac{1}{3}$ and $\frac{|f(t_2)|}{|g(t)|} < \frac{1}{3}$. We then find $m_2 > m_1$ so that if $n > m_2$ then $t_n \in N_I(a, k_3)$. We define $s_i = t_1$ if $m_1 < i \le m_2$. Note that $\frac{|g(s_i)|}{|g(t_i)|} < \frac{1}{2}$ and $\frac{|f(s_i)|}{|g(t_i)|} < \frac{1}{2}$ if $m_1 < i \le m_2$ (since $g(s_i) = g(t_1)$ and $f(s_i) = f(t_1)$ if $m_1 < i \le m_2$).

We then find $k_4 > k_3$ so that if $t \in N_I(a, k_4)$ then $\frac{|g(t_3)|}{|g(t)|} < \frac{1}{4}$ and $\frac{|f(t_3)|}{|g(t)|} < \frac{1}{4}$. We then find $m_3 > m_2$ so that if $n > m_3$ then $t_n \in N_I(a, k_4)$. We define $s_i = t_2$ if $m_2 < i \le m_3$. Note that $\frac{|g(s_i)|}{|g(t_i)|} < \frac{1}{3}$ and $\frac{|f(s_i)|}{|g(t_i)|} < \frac{1}{3}$ if $m_2 < i \le m_3$.

We continue in this manner, creating a sequence of points $\{s_i\}$ with a corresponding increasing sequence of natural numbers $\{m_j\}$ so that $s_i = t_j$ for all $m_j < i \le m_{j+1}$, with the $m_j$ integers chosen so that whenever $i > m_j$ it is true that $\frac{|g(s_i)|}{|g(t_i)|} < \frac{1}{j}$ and $\frac{|f(s_i)|}{|g(t_i)|} < \frac{1}{j}$.

Then $\{s_i\} \to a$ (since $\{t_j\} \to a$ and $s_i = t_j$ for $m_j < i \le m_{j+1}$), and $\frac{|g(s_i)|}{|g(t_i)|} \to 0$ and $\frac{|f(s_i)|}{|g(t_i)|} \to 0$ (since for $i > m_j$ we know that $0 \le \frac{|g(s_i)|}{|g(t_i)|} < \frac{1}{j}$ and $0 \le \frac{|f(s_i)|}{|g(t_i)|} < \frac{1}{j}$).

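By the Cauchy Mean Value Theorem, for each $n \in \mathbb{N}$ we can choose $c_n$ between $s_n$ and $t_n$ so that $f'(c_n)(g(t_n) - g(s_n)) = g'(c_n)(f(t_n) - f(s_n))$. Then

\[ \frac{f'(c_n)}{g'(c_n)} = \frac{f(t_n) - f(s_n)}{g(t_n) - g(s_n)} = \frac{\frac{f(t_n)}{g(t_n)} - \frac{f(s_n)}{g(t_n)}}{1 - \frac{g(s_n)}{g(t_n)}}, \quad \text{so} \quad \frac{f(t_n)}{g(t_n)} - \frac{f(s_n)}{g(t_n)} = \frac{f'(c_n)}{g'(c_n)}\left(1 - \frac{g(s_n)}{g(t_n)}\right). \]

Since $\left\{\frac{f'(c_n)}{g'(c_n)}\right\} \to L$ and $\left\{1 - \frac{g(s_n)}{g(t_n)}\right\} \to 1$, by Exercise 3.9 we know that $\left\{\frac{f(t_n)}{g(t_n)} - \frac{f(s_n)}{g(t_n)}\right\} \to L$. Since $\left\{\frac{f(s_n)}{g(t_n)}\right\} \to 0$, by Theorem 7.39 we know that $\left\{\frac{f(t_n)}{g(t_n)}\right\} \to L$. Since $\{t_n\}$ was an arbitrary sequence in $N_I(a, k_1)$ approaching $a$, we conclude that $\lim_{x \to a} \frac{f(x)}{g(x)} = L$.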


Theorem 7.55. L’Hospital’s rule can be extended to sequences. Let $f, g : [1, \infty) \to \mathbb{R}$
be differentiable functions, where $g$ is non-zero. Let $\{f(n)\}, \{g(n)\}$ be sequences defined
by restricting $f$ and $g$ to the natural numbers. Assume $\lim_{n \to \infty} f(n) = 0 = \lim_{n \to \infty} g(n)$ or
$\lim_{n \to \infty} f(n) = \pm\infty$ and $\lim_{n \to \infty} g(n) = \pm\infty$. Also assume that $\lim_{x \to \infty} \frac{f'(x)}{g'(x)} = L$. Then it
follows that $\lim_{n \to \infty} \frac{f(n)}{g(n)} = L$.

Proof. By L’Hospital’s rule, we know that $\lim_{x \to \infty} \frac{f(x)}{g(x)} = L$. Thus, by the Sequential Characterization
of Limits for Extended Real Numbers, we have $\lim_{n \to \infty} \frac{f(n)}{g(n)} = L$.


Usually, when we use L’Hospital’s rule for sequences as above, we tend to abuse notation
by not observing that the sequences can be extended to functions on intervals and then using
this theorem to conclude that the limit of the ratio of the sequences is the same as the limit
of the ratio of the functions. Instead, we tend to treat $n$ as a variable representing any
number in $[1, \infty)$, use L’Hospital’s rule and then treat the limit as the limit of the sequence
itself afterwards. It should be understood that this is what we are doing, however, since
otherwise it doesn’t make much sense to use L’Hospital’s rule (since a function defined only
on the integers is not differentiable at all). We don’t normally quote this theorem, and
instead just use it without mentioning it.
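
As a quick numerical illustration of L’Hospital’s rule for sequences (a sketch only, not part of any proof), take the hypothetical choice $f(x) = \ln(x)$ and $g(x) = x$, so that $f'(x)/g'(x) = 1/x$ approaches $0$. The Python snippet below, whose function names are assumptions made for this example, tabulates the sequence $f(n)/g(n)$ next to $f'(n)/g'(n)$; both columns shrink toward $0$.

```python
import math

def ratio(n):
    """Term f(n)/g(n) of the sequence, with f(x) = ln(x) and g(x) = x."""
    return math.log(n) / n

def derivative_ratio(x):
    """Value of f'(x)/g'(x) = (1/x)/1 for the same f and g."""
    return 1.0 / x

# Treat n as a real variable, apply the rule, then read off the sequence limit.
for n in (10, 10**3, 10**6):
    print(n, ratio(n), derivative_ratio(n))
```

A table like this only suggests the limit; the conclusion itself comes from the theorem.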


7.8 More on Integration 


Change of Variables for Single Variable Functions: 


Theorem 7.56. Let $\phi : U \to \mathbb{R}$ be $C^1$ on $[a, b]$, where $U$ is an open set containing
$[a, b]$, with $\phi'(x) \neq 0$ on $[a, b]$.

(a) There is an interval $[c, d]$ so that $\phi([a, b]) = [c, d]$.

(b) $\phi^{-1}$ has continuous non-zero derivative on $[c, d]$.

(c) There is an $M > 0$ so that for any two points $x, y \in [a, b]$, $|\phi(x) - \phi(y)| \le M|x - y|$, and for any two points $x, y \in [c, d]$, $|\phi^{-1}(x) - \phi^{-1}(y)| \le M|x - y|$.

(d) If $D \subset [a, b]$ and $\lambda(D) = 0$ then $\lambda(\phi(D)) = 0$. If $E \subset [c, d]$ and $\lambda(E) = 0$ then
$\lambda(\phi^{-1}(E)) = 0$.

(e) Let $f$ be integrable on $[c, d]$. Then $f \circ \phi$ is integrable on $[a, b]$.


Proof. (a) First, $\phi([a, b])$ is a closed interval $[c, d]$ by Exercise 4.13.

(b) By Theorem 5.26, we know that $\phi$ is strictly monotone and by the Inverse
Function Theorem we know that $\phi^{-1}$ is differentiable with a non-zero derivative
$\frac{1}{\phi'(\phi^{-1}(x))}$ on $[c, d]$. Since $\phi'(x)$ is continuous and non-zero, and $\phi^{-1}(x)$ is continuous,
it follows that the derivative of $\phi^{-1}(x)$ is continuous on $[c, d]$.

(c) Let $\psi = \phi^{-1}$. Since $\phi'$ and $\psi'$ are non-zero and continuous on $[a, b]$ and $[c, d]$ respectively, by the Extreme
Value Theorem these functions are bounded, so we can find $M$ so that $M > |\phi'(x)|$
for all $x \in [a, b]$ and $M > |\psi'(x)|$ for all $x \in [c, d]$. Hence, by Exercise 5.10, we know
that for any $x, y \in [a, b]$, if $x \neq y$ then $|\phi(x) - \phi(y)| < M|x - y|$, and if $x, y \in [c, d]$ with $x \neq y$
then $|\phi^{-1}(x) - \phi^{-1}(y)| < M|x - y|$.

(d) By (c) we can find $M > 0$ so that for any two points $x, y \in [a, b]$, $|\phi(x) - \phi(y)| \le M|x - y|$, and for any two points $x, y \in [c, d]$, $|\phi^{-1}(x) - \phi^{-1}(y)| \le M|x - y|$.

Assume $\lambda(D) = 0$ for some $D \subset [a, b]$. Let $\epsilon > 0$ and choose a collection of closed
intervals $\{I_i\}_{i \in \mathbb{N}}$ covering $D$ so that $\sum_{i=1}^{\infty} |I_i| < \frac{\epsilon}{M}$. By (a) we know that each $\phi(I_i)$ is a
closed interval, and for any two points $x, y$ in $I_i$ we know that $|\phi(x) - \phi(y)| \le M|x - y|$,
which means that $\phi(I_i)$ is a closed interval whose end points are no farther apart
than $M|I_i|$ and therefore $|\phi(I_i)| \le M|I_i|$. Hence, $\{\phi(I_i)\}_{i \in \mathbb{N}}$ is a cover of $\phi(D)$ and
$\sum_{i=1}^{\infty} |\phi(I_i)| \le M \sum_{i=1}^{\infty} |I_i| < M \frac{\epsilon}{M} = \epsilon$. Thus, $\lambda(\phi(D)) = 0$. Since $\phi^{-1}$ is also a $C^1$ one
to one function by (b), it follows that if $\lambda(E) = 0$ for some $E \subset [c, d]$ then $\phi^{-1}(E)$
also has Lebesgue measure zero.

(e) Let $D_f$ be the set of points of $[c, d]$ at which $f$ is not continuous. We know
that $\lambda(D_f) = 0$ by the Lebesgue Characterization of Riemann Integrability, which
means that $\lambda(\phi^{-1}(D_f)) = 0$ by part (d). If $x \in [a, b] \setminus \phi^{-1}(D_f)$ then $\phi$ is continuous
at $x$ and $f$ is continuous at $\phi(x)$, which means that the set $D_{f \circ \phi}$ of all discontinuities
of $f \circ \phi$ in $[a, b]$ is a subset of $\phi^{-1}(D_f)$, a set of measure zero, from which we conclude that
$f \circ \phi$ is integrable on $[a, b]$.


The following is a stronger form of the substitution result for compositions of 
integrable functions with continuously differentiable functions. 


Theorem 7.57. Change of variables, single variable case.
Let $\phi$ be continuously differentiable on $E = [a, b]$ with $\phi'(x) \neq 0$ on $[a, b]$. Let $f$ be
an integrable function on $\phi([a, b]) = [c, d]$. Then $\int_a^b (f \circ \phi)|\phi'| = \int_c^d f$.


Proof. First, note that $f \circ \phi$ is integrable on $E$, and we can find $M > 1$ so that for any
two points $x, y \in [a, b]$, $|\phi(x) - \phi(y)| \le M|x - y|$, and for any two points $x, y \in [c, d]$,
$|\phi^{-1}(x) - \phi^{-1}(y)| \le M|x - y|$, by Theorem 7.56. This also means that $(f \circ \phi)|\phi'|$ is
integrable on $[a, b]$ by Theorem 6.11.

Let $\epsilon > 0$. By Theorem 6.6 we can find a $\delta > 0$ so that if $P$ is a partition of $[a, b]$ with
$|P| < \delta$ then $U((f \circ \phi)|\phi'|, P) - L((f \circ \phi)|\phi'|, P) < \frac{\epsilon}{2}$, and if $Q$ is a partition of $[c, d]$ so
that $|Q| < \delta$ then $U(f, Q) - L(f, Q) < \frac{\epsilon}{2}$.

By Theorem 5.26, we know that $\phi$ is strictly monotone. For now, we will assume that $\phi$ is
increasing and so $|\phi'(x)| = \phi'(x) > 0$ on $[a, b]$, since it is impossible for $\phi'(x) < 0$ for
any $x$ for an increasing function by Exercise 5.12.

Choose a partition $P = \{p_0, p_1, p_2, \ldots, p_n\}$ of $[a, b]$ so that $|P| < \frac{\delta}{M}$. Let $Q = \{q_0, q_1, q_2, \ldots, q_n\}$ be a partition of $[c, d]$ where $q_i = \phi(p_i)$ for all $0 \le i \le n$. Since
$|q_i - q_{i-1}| \le M|p_i - p_{i-1}| < M \frac{\delta}{M} = \delta$, we know $U((f \circ \phi)|\phi'|, P) - L((f \circ \phi)|\phi'|, P) < \frac{\epsilon}{2}$
and $U(f, Q) - L(f, Q) < \frac{\epsilon}{2}$. Thus, any Riemann sum of $(f \circ \phi)|\phi'|$ with respect to
partition $P$ has distance less than $\frac{\epsilon}{2}$ from $\int_a^b (f \circ \phi)|\phi'|$, and any Riemann sum of $f$
with respect to partition $Q$ has distance less than $\frac{\epsilon}{2}$ from $\int_c^d f$.


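By the Mean Value Theorem we can find a marking $R = \{p_i^*\}_{1 \le i \le n}$ so that $p_i^* \in (p_{i-1}, p_i)$ and $\phi'(p_i^*)(p_i - p_{i-1}) = \phi(p_i) - \phi(p_{i-1}) = q_i - q_{i-1}$ for all $1 \le i \le n$. Let
$T = \{\phi(p_i^*)\}_{1 \le i \le n}$, which is a marking of $Q$ since $\phi$ is increasing. Then

\[ S_T(f, Q) = \sum_{i=1}^{n} f(\phi(p_i^*))(q_i - q_{i-1}) = \sum_{i=1}^{n} f(\phi(p_i^*))\phi'(p_i^*)(p_i - p_{i-1}) = S_R((f \circ \phi)|\phi'|, P). \]

It follows that

\[ \left| \int_c^d f - \int_a^b (f \circ \phi)|\phi'| \right| \le \left| \int_c^d f - S_T(f, Q) \right| + \left| S_T(f, Q) - S_R((f \circ \phi)|\phi'|, P) \right| + \left| S_R((f \circ \phi)|\phi'|, P) - \int_a^b (f \circ \phi)|\phi'| \right| < \frac{\epsilon}{2} + 0 + \frac{\epsilon}{2} = \epsilon. \]

Since this is true for all $\epsilon > 0$ we conclude that $\int_c^d f = \int_a^b (f \circ \phi)|\phi'|$.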


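If $\phi$ is decreasing the proof is similar. We have $\phi'(x) < 0$ on $[a, b]$ and we choose
$P$ and $Q$ as before except that $q_i = \phi(p_{n-i})$. As before, we can find a marking $R = \{p_i^*\}_{1 \le i \le n}$ so that $p_i^* \in (p_{i-1}, p_i)$ and $\phi'(p_i^*)(p_i - p_{i-1}) = \phi(p_i) - \phi(p_{i-1}) = q_{n-i} - q_{n-i+1}$
for all $1 \le i \le n$. Let $T = \{\phi(p_i^*)\}_{1 \le i \le n}$, which is a marking of $Q$ since $\phi$ is decreasing.
Then

\[ S_T(f, Q) = \sum_{i=1}^{n} f(\phi(p_i^*))(q_{n-i+1} - q_{n-i}) = -\sum_{i=1}^{n} f(\phi(p_i^*))\phi'(p_i^*)(p_i - p_{i-1}) = -S_R((f \circ \phi)\phi', P) = S_R((f \circ \phi)|\phi'|, P), \]

since $|\phi'| = -\phi'$.

Thus, as before,

\[ \left| \int_c^d f - \int_a^b (f \circ \phi)|\phi'| \right| \le \left| \int_c^d f - S_T(f, Q) \right| + \left| S_T(f, Q) - S_R((f \circ \phi)|\phi'|, P) \right| + \left| S_R((f \circ \phi)|\phi'|, P) - \int_a^b (f \circ \phi)|\phi'| \right| < \frac{\epsilon}{2} + 0 + \frac{\epsilon}{2} = \epsilon, \]

and since this is true for all $\epsilon > 0$ we conclude that $\int_c^d f = \int_a^b (f \circ \phi)|\phi'|$.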
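
The change of variables formula can be checked numerically on a small example. The sketch below is an illustration under assumed choices, not part of the proof: it takes $\phi(x) = 1/x$ on $[1, 2]$ (decreasing, with $\phi([1, 2]) = [1/2, 1]$) and $f(u) = u^2$, and approximates both integrals with a midpoint rule. The helper name `midpoint_integral` and the sample functions are hypothetical.

```python
def midpoint_integral(h, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of h over [lo, hi]."""
    width = (hi - lo) / n
    return sum(h(lo + (i + 0.5) * width) for i in range(n)) * width

def f(u):
    return u * u

def phi(x):
    return 1.0 / x            # decreasing on [1, 2], phi([1, 2]) = [1/2, 1]

def phi_prime(x):
    return -1.0 / (x * x)     # non-zero on [1, 2]

# Left side: integral over [a, b] of (f o phi)|phi'|.
lhs = midpoint_integral(lambda x: f(phi(x)) * abs(phi_prime(x)), 1.0, 2.0)
# Right side: integral of f over [c, d] = phi([a, b]).
rhs = midpoint_integral(f, 0.5, 1.0)
print(lhs, rhs)
```

Both values agree with the exact answer $7/24$ up to the quadrature error. Note that the absolute value on $\phi'$ is what lets the right side be taken over $[c, d]$ with $c < d$ even though $\phi$ is decreasing.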


Improper Integration: 

Definite Riemann integrals are only defined for bounded functions over closed, 
bounded intervals. We can extend the idea of an integral to unbounded intervals or 
to functions whose ranges are unbounded with the idea of an improper integral, which 
is the limit of a definite integral or the sum of multiple such limits. 

If one bound of an integral is infinity (or negative infinity respectively) then we 
can replace that bound by a variable, evaluate the resulting integral as normal and 
let the bound approach infinity (or negative infinity respectively). If the resulting 
limit exists then we say that the improper integral exists (converges) and is equal to 
the specified limit. If not then we say that the improper integral diverges (it does not 
exist). If the interval over which the integral is taken is from negative infinity on the 
lower bound to infinity on the upper bound then you must take a point in between 
and split the integral into two integrals each of which only approaches an infinite 
limit at one bound. If either integral diverges then we say the integral is divergent. 
If both converge then the integral is the sum of the resulting integrals. 


Definition 54 


Let $f : D \to \mathbb{R}$. If $[a, \infty) \subset D$ then we say that the improper integral
$\int_a^{\infty} f(x)\,dx = \lim_{b \to \infty} \int_a^b f(x)\,dx$ if this limit exists.

If $(-\infty, a] \subset D$ then we say that the improper integral $\int_{-\infty}^{a} f(x)\,dx = \lim_{b \to -\infty} \int_b^a f(x)\,dx$ if this limit exists.

If $(c, d] \subset D$ and $f$ is not integrable on $[c, d]$ then we say that the improper
integral $\int_c^d f(x)\,dx = \lim_{b \to c^+} \int_b^d f(x)\,dx$ if this limit exists.

If $[a, c) \subset D$ and $f$ is not integrable on $[a, c]$ then we say that the improper
integral $\int_a^c f(x)\,dx = \lim_{b \to c^-} \int_a^b f(x)\,dx$ if this limit exists.

If an improper integral of $f$ exists on an interval $D$ and is equal to $I$
then we also say that the improper integral converges to $I$, and write $\int_D f = I$.
If the improper integral does not exist we refer to the improper integral as
divergent.

If D is a finite union of intervals, no two of which intersect at more than 
one point, on which the improper integrals of f all exist, then the sum of the 
improper integrals of f on each interval is defined to be the improper integral 
of f on D. Otherwise, the improper integral of f on D does not exist (it is 
divergent). 


Pe ail 
Example 7.1. Find the improper integral dl de. 
1 & 


b 148 
Solution. Our definition of improper integral is that this is lim — da = lim —5| = 
b-00 1 2« b> 00 DoF Ne 

1 2 1 


po oe oS 


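Example 7.2. Find the improper integral $\int_{-\infty}^{\infty} e^{-x}\,dx$.

Solution. We pick a point to separate the integral into a sum of two integrals which
are improper at only one bound. We will choose $\int_{-\infty}^{0} e^{-x}\,dx + \int_0^{\infty} e^{-x}\,dx$. Integrating
we have

\[ \int_0^{\infty} e^{-x}\,dx = \lim_{b \to \infty} \int_0^b e^{-x}\,dx = \lim_{b \to \infty} \left. -e^{-x} \right|_0^b = \lim_{b \to \infty} -e^{-b} + 1 = 1. \]

For the other integral we have

\[ \lim_{b \to -\infty} \int_b^0 e^{-x}\,dx = \lim_{b \to -\infty} \left. -e^{-x} \right|_b^0 = \lim_{b \to -\infty} e^{-b} - 1 = \infty. \]

Thus, $\int_{-\infty}^{\infty} e^{-x}\,dx$ diverges.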
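
The two one-sided limits in this kind of example can be seen numerically. The Python sketch below is illustrative only: it uses the antiderivative $-e^{-x}$ to evaluate $\int_0^b e^{-x}\,dx$ and $\int_{-b}^0 e^{-x}\,dx$ for growing $b$; the function name `integral_exp_neg` is an assumption made for this example.

```python
import math

def integral_exp_neg(lo, hi):
    """Exact integral of e^(-x) over [lo, hi], from the antiderivative -e^(-x)."""
    return math.exp(-lo) - math.exp(-hi)

# The upper piece settles near 1 while the lower piece grows without bound.
for b in (1, 5, 10, 20):
    print(b, integral_exp_neg(0, b), integral_exp_neg(-b, 0))
```

Because one of the two pieces has no finite limit, the integral over the whole line is divergent, regardless of how the other piece behaves.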


In the case where there is a finite set of points at which the integrand approaches
infinity or negative infinity, we separate the integral into a sum of integrals over
subintervals of the interval over which the original integral is taken, so that each
integrand approaches infinity (or each bound is infinite) at only one bound of the integral and at no other
point of the subinterval over which that integral is defined. For each integral where
one bound is a point at which the integrand approaches infinity or negative infinity,
we replace that bound by $b$ and take the limit as $b$ approaches the original bound
from the direction so that $b$ is in the interval over which the integral was taken. If
any of these new integrals diverges we say the original integral diverges. If all of
them converge then the sum of the integrals on these subintervals is the value of the
original integral.


It is sometimes helpful to compare two functions and determine the convergence
or divergence of one function’s improper integral by comparing it to that of the
other. In order to handle all cases at once, we will use some of our development of
limits at extended real numbers from earlier in the Supplementary Materials.


Theorem 7.58. Let $D$ be an interval in the domain of $f$ and $g$ so that either $D$
has finite length and $f$ and $g$ approach $\pm\infty$ at exactly one end point of $D$, or $D$ is
either unbounded above or below (but not both) and $f$ and $g$ do not approach infinity
at any point of $D$. Let $0 \le f(x) \le g(x)$ on $D$. Then if $\int_D g(x)\,dx = L$, a finite value,
then $\int_D f(x)\,dx = M$, a finite value so that $M \le L$. Likewise, if $\int_D f(x)\,dx = \infty$ then
$\int_D g(x)\,dx = \infty$.


Proof. We know that $\int_D f = \lim_{b \to c} \int_{D(b)} f$, where $c$ is an extended real number and
$D(b)$ is an interval contained in $D$, one of whose ends is $b$, and the other is the other
(finite) end point of $D$ at which $f$ and $g$ do not approach infinity. Since $f$ and $g$ are
non-negative on $D$ we know that if $b_2$ is closer to $c$ than $b_1$ is to $c$ (where what is
meant by “closer” if $c = \infty$ for this proof is that $b_2 > b_1$; what is meant by “closer” if
$c = -\infty$ is that $b_2 < b_1$) then $\int_{D(b_1)} f \le \int_{D(b_2)} f$ since $\int_{D(b_2)} f - \int_{D(b_1)} f = \int_J f \ge 0$,
where $J$ is the interval whose end points are $b_1$ and $b_2$. Likewise, $\int_{D(b_1)} g \le \int_{D(b_2)} g$.

Choose a sequence $\{x_n\} \subset D$ so that $\{x_n\} \to c$ and for each natural number $n$ it
is true that $x_{n+1}$ is closer to $c$ than $x_n$.

First, assume that $\int_D g(x)\,dx = L$. Then by the Sequential Characterization of
Limits for Extended Real Numbers, we know that $\left\{\int_{D(x_n)} g\right\} \to L$, where $\left\{\int_{D(x_n)} g\right\}$
is a non-decreasing sequence whose supremum is $L$. The sequence $\left\{\int_{D(x_n)} f\right\}$ is also
non-decreasing and is bounded above by $L$, which means that $\left\{\int_{D(x_n)} f\right\} \to M = \sup_{n \in \mathbb{N}} \int_{D(x_n)} f$ for some $M \le L$. This means that given any $\epsilon > 0$ there is an $x_n$ so that
$M - \int_{D(x_n)} f < \epsilon$. If $b$ is in the interior of $D$ and is closer to $c$ than $x_n$ then there is an
$x_j$ closer to $c$ than $b$, which means $M - \epsilon < \int_{D(x_n)} f \le \int_{D(b)} f \le \int_{D(x_j)} f \le M$, so it follows
that $\lim_{b \to c} \int_{D(b)} f = M$.

Finally, assume that $\int_D f(x)\,dx = \infty$. Then by the Sequential Characterization of
Limits for Extended Real Numbers, we know that $\left\{\int_{D(x_n)} f\right\} \to \infty$. By the Squeeze
Theorem for Extended Real Numbers we know that $\left\{\int_{D(x_n)} g\right\} \to \infty$. This means that
given any $R > 0$ there is an $x_n$ so that $\int_{D(x_n)} g > R$. If $b$ is in the interior of $D$ and is
closer to $c$ than $x_n$ then $R < \int_{D(x_n)} g \le \int_{D(b)} g$, so it follows that $\lim_{b \to c} \int_{D(b)} g = \infty$.


It is sometimes helpful in both series, and in many infinite limits or limits at 
infinity to note the following relative sizes of increase. We will refer to this as the 
“Order of Bigness List” (it has no formal name that is generally accepted, so we will 
use this silly name which we hope is unlikely to be duplicated as meaning something 
else in another context). Some aspects of the associated theorem we prove here are 
useful now, and others are useful after we cover series. To describe them it is helpful 
to introduce a new notation. 


Definition 55 


We say that a function $f(x)$ converges at a rate less than that of a positive
constant times $g(x)$, a function which is positive for sufficiently large values
of $x$, or that $f(x)$ is big O of $g(x)$, written $f(x)$ is $O(g(x))$, if there is some
$k \in \mathbb{N}$ and $c > 0$ so that $|f(x)| \le c\,g(x)$ for all $x \ge k$. This is also written
$f(x) = O(g(x))$.

Note that while the “=” sign is part of an acceptable notation for the “big O”
notation described above, it does not mean equality in this context. It is immediate
that if $f$ is $O(g(x))$ and $h(x) \ge g(x)$ for all sufficiently large $x$ then $f$ is $O(h(x))$.


Theorem 7.59. The Order of Bigness List. On domain $[1, \infty)$, for the following
(ordered) list of functions: $\ln(x)$, $x^p$ where $0 < p \le 1$, $x^q$ where $q > 1$, $r^x$ where $r > 1$,
$\lfloor x \rfloor!$ and $x^x$:

(a) Each function on the list approaches infinity as $x$ approaches infinity.

(b) If a function $f$ precedes (is listed earlier in the list than) a function $g$ then
$\lim_{x \to \infty} \frac{f(x)}{g(x)} = 0$. It is also true that $\lim_{x \to \infty} \frac{1}{\ln(x)} = 0$.

(c) With the exception of the pair consisting of $\ln(x)$ and $x^p$ where $0 < p \le 1$, and
the pair $x^p$ where $0 < p \le 1$ and $x^q$ where $q > 1$ if $q - p \le 1$, if $f$ precedes $g$ in this
list then there is a number $k > 1$ so that $\frac{f(x)}{g(x)}$ is $O\left(\frac{1}{x^k}\right)$.


Proof. (a) By Theorem 6.25, $\ln(x)$ is increasing and is not bounded above, which means
that for any $M > 0$ there is some $k > 0$ so that if $x > k$ then $\ln(x) > M$ and therefore
$\lim_{x \to \infty} \ln(x) = \infty$.

Next, we look at $x^p$ for $0 < p$. Taking the natural log, we get $p\ln(x)$. Since
$p > 0$ and $\lim_{x \to \infty} \ln(x) = \infty$, we know that $\lim_{x \to \infty} p\ln(x) = \infty$ by Theorem 7.36. Hence, given any $M > 0$ we can find
$k > 1$ so that if $x > k$ then $p\ln(x) > \ln(M)$ and thus $e^{p\ln(x)} > M$, so $x^p > M$, which
means that $\lim_{x \to \infty} x^p = \infty$.

Next, if $r > 1$ then note that $\ln(r) > 0$ by Theorem 6.25, so $(r^x)' = r^x\ln(r) > 0$,
which means that $r^x$ is increasing.

Let $M > 0$. If $x > \frac{\ln(M)}{\ln(r)}$ it follows that $r^x > r^{\frac{\ln(M)}{\ln(r)}} = e^{\ln(r)\frac{\ln(M)}{\ln(r)}} = e^{\ln(M)} = M$ by
definition. Thus, $\lim_{x \to \infty} r^x = \infty$.

Next, note that $\lfloor x \rfloor! \ge (n - 1)!$ for each natural number $n \le x$. Let $M > 0$.
Since $\mathbb{N}$ is not bounded above we can find $k > M + 1$, so if $x > k$ then $\lfloor x \rfloor! \ge
(k - 1)! = (k - 1)(k - 2)\cdots(1) > M$. Thus, $\lim_{x \to \infty} \lfloor x \rfloor! = \infty$.

Finally, for any $M > 0$ we can pick $k > \max\{1, M\}$. If $x > k$ then $x\ln(x) >
x\ln(k)$ since $x > 0$ and $\ln(x)$ is increasing. Since $e^x$ is increasing, this means that
$e^{x\ln(x)} > e^{x\ln(k)}$, so $x^x > k^x$. We have already established that $\lim_{x \to \infty} k^x = \infty$, so by
the Squeeze Theorem for Extended Real Numbers we see that $\lim_{x \to \infty} x^x = \infty$.
(b) and (c) Since we know that $\lim_{x \to \infty} \ln(x) = \infty$, it follows from Theorem 7.37 that
$\lim_{x \to \infty} \frac{1}{\ln(x)} = 0$.

Let $p > 0$. Using L’Hospital’s Rule, we see that $\lim_{x \to \infty} \frac{\ln(x)}{x^p} = \lim_{x \to \infty} \frac{1/x}{p x^{p-1}} = \lim_{x \to \infty} \frac{1}{p x^p} = 0$ by Theorem 7.37, since we know that $\lim_{x \to \infty} p x^p = \infty$ from part (a) and
Theorem 7.36.

Let $q > 1$ and $0 < p \le 1$. Then $\lim_{x \to \infty} \frac{x^p}{x^q} = \lim_{x \to \infty} \frac{1}{x^{q-p}} = 0$ by Theorem 7.37, since
we know that $\lim_{x \to \infty} x^{q-p} = \infty$ from part (a). In the case where $q - p > 1$, $\frac{x^p}{x^q}$ is
$O\left(\frac{1}{x^{q-p}}\right)$ where $q - p > 1$. Likewise, if $q > 1$ then we can pick $p$ so that $0 < p < q - 1$,
and notice that $\frac{\ln(x)}{x^q} = \frac{\ln(x)}{x^p} \cdot \frac{1}{x^{q-p}}$, and for sufficiently large $x$ we know that $\frac{\ln(x)}{x^p} < 1$,
so $\frac{\ln(x)}{x^q}$ is $O\left(\frac{1}{x^{q-p}}\right)$ where $q - p > 1$.

Let $p > 1$ and $q > 0$. Then we can find a natural number $m$ so that $m - 1 < q \le m$.
By L’Hospital’s rule, $\lim_{x \to \infty} \frac{x^q}{p^x} = \lim_{x \to \infty} \frac{q x^{q-1}}{p^x \ln(p)}$ and so on, until the numerator no
longer approaches infinity, so the limit is equal to $\lim_{x \to \infty} \frac{q(q-1)\cdots(q-m+1)x^{q-m}}{p^x(\ln(p))^m} = 0$
by Theorem 7.37.

Since $q$ was an arbitrary positive number we can replace $q$ by $q + 2$, so we can
choose $k$ so that if $x > k$ then $\frac{x^{q+2}}{p^x} < 1$, which means that $\frac{x^q}{p^x} < \frac{1}{x^2}$. Thus,
$\frac{x^q}{p^x}$ is $O\left(\frac{1}{x^2}\right)$. Likewise, since $\ln(x) < x^r < x^q$ for sufficiently large $x$ for $0 < r \le 1 < q$,
it follows that $\frac{\ln(x)}{p^x}$ and $\frac{x^r}{p^x}$ are also $O\left(\frac{1}{x^2}\right)$.

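Next, let $p > 1$ and choose $m \in \mathbb{N}$ so $m > 2p + 1$. Then if $m + k \le x < m + k + 1$, where $k$ is a non-negative integer,

\[ \frac{p^x}{\lfloor x \rfloor!} < \frac{p^{m+k+1}}{(m+k)!} = \frac{p}{m+k} \cdot \frac{p}{m+k-1} \cdots \frac{p}{m+1} \cdot \frac{p^{m+1}}{m!} \le \left(\frac{1}{2}\right)^k \frac{p^{m+1}}{m!}. \]

We know that $\left\{\left(\frac{1}{2}\right)^n\right\} \to 0$ by Exercise 3.13, so for any $\epsilon > 0$ we can choose $t \in \mathbb{N}$ so that if $n > t$ then $\left(\frac{1}{2}\right)^n \frac{p^{m+1}}{m!} < \epsilon$. Thus, if $x > m + t + 1$ then $\frac{p^x}{\lfloor x \rfloor!} < \epsilon$ and hence $\lim_{x \to \infty} \frac{p^x}{\lfloor x \rfloor!} = 0$. This also
establishes that $\frac{p^x}{\lfloor x \rfloor!}$ is $O\left(\frac{1}{2^x}\right)$. Since we also know from the preceding paragraph
that $\frac{1}{2^x}$ is $O\left(\frac{1}{x^2}\right)$ and $\ln(x) < x^r < x^q < p^x$ for sufficiently large $x$, this means that
$\frac{\ln(x)}{\lfloor x \rfloor!}$, $\frac{x^r}{\lfloor x \rfloor!}$, $\frac{x^q}{\lfloor x \rfloor!}$ and $\frac{p^x}{\lfloor x \rfloor!}$ are each $O\left(\frac{1}{x^2}\right)$.

Finally, for $x \ge 2$,

\[ \frac{\lfloor x \rfloor!}{x^x} \le \frac{\lfloor x \rfloor!}{\lfloor x \rfloor^{\lfloor x \rfloor}} = \frac{\lfloor x \rfloor}{\lfloor x \rfloor} \cdot \frac{\lfloor x \rfloor - 1}{\lfloor x \rfloor} \cdots \frac{2}{\lfloor x \rfloor} \cdot \frac{1}{\lfloor x \rfloor} \le \frac{2}{\lfloor x \rfloor^2}, \]

so $\frac{\lfloor x \rfloor!}{x^x}$ is $O\left(\frac{1}{x^2}\right)$.

Since we have shown $\ln(x) < x^r < x^q < p^x < \lfloor x \rfloor!$ for $0 < r \le 1$, $q > 1$ and
$p > 1$ (for sufficiently large $x$) we conclude that $\frac{\ln(x)}{x^x}$, $\frac{x^r}{x^x}$, $\frac{x^q}{x^x}$ and $\frac{p^x}{x^x}$ are each
$O\left(\frac{1}{x^2}\right)$.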
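
The ordering in the Order of Bigness List can be explored numerically. The sketch below is an illustration under assumed parameters ($p = 1/2$, $q = 2$, $r = 2$), not a proof. To avoid overflow it works with natural logarithms of the six functions, using `math.lgamma` for $\ln(\lfloor x \rfloor!)$, and then reports the ratio $f(x)/g(x)$ for each adjacent pair on the list. The function names are hypothetical.

```python
import math

def log_values(x):
    """Natural logs of the six listed functions at x > 1, in list order."""
    return [
        math.log(math.log(x)),            # ln(x)
        0.5 * math.log(x),                # x^p with p = 1/2
        2.0 * math.log(x),                # x^q with q = 2
        x * math.log(2.0),                # r^x with r = 2
        math.lgamma(math.floor(x) + 1),   # floor(x)!
        x * math.log(x),                  # x^x
    ]

def consecutive_ratios(x):
    """Ratios f(x)/g(x) for each adjacent pair f, g on the list."""
    logs = log_values(x)
    return [math.exp(a - b) for a, b in zip(logs, logs[1:])]

for x in (50, 200):
    print(x, consecutive_ratios(x))
```

Each ratio is already below $1$ at $x = 50$ and is smaller still at $x = 200$, consistent with part (b); how fast each ratio shrinks is what part (c) quantifies.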


Lebesgue Characterization of Riemann Integrability: 


Definition 56 


We say that a set $S$ has Lebesgue measure zero, denoted $\lambda(S) = 0$, if and
only if for every $\epsilon > 0$ there is a sequence of open intervals $\{I_i\}$ which covers $S$
so that $\sum_{i=1}^{\infty} |I_i| < \epsilon$ (where $|I_i|$ denotes the length of interval $I_i$).

Let $f : D \to \mathbb{R}$ be bounded. For each non-degenerate interval $I$ we define the
oscillation of $f$ on $I$ to be $\Omega_f I = \sup_{x \in I \cap D} f(x) - \inf_{x \in I \cap D} f(x)$. If $p \in D$ then we
define the oscillation of $f$ at $p$ to be $\omega_f(p) = \lim_{h \to 0^+} \Omega_f(p - h, p + h)$.


Theorem 7.60. Let $A \subset B$ and let $\lambda(B) = 0$. Then $\lambda(A) = 0$.

Proof. Let $\epsilon > 0$. Then there is an open cover of $B$ by a countable collection of open
intervals $\{I_n\}$ so that $\sum_{n=1}^{\infty} |I_n| < \epsilon$. Since $\{I_n\}$ is also a cover for $A$ it follows that
$\lambda(A) = 0$.


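Theorem 7.61. Let $E = \{p_1, p_2, p_3, \ldots\}$ be countable. Then $\lambda(E) = 0$.

Proof. Let $\epsilon > 0$ and let $I_i = \left(p_i - \frac{\epsilon}{2^{i+2}}, p_i + \frac{\epsilon}{2^{i+2}}\right)$. Then $\{I_i\}$ covers $E$ and $\sum_{i=1}^{\infty} |I_i| = \sum_{i=1}^{\infty} \frac{\epsilon}{2^{i+1}} = \frac{\epsilon}{2} < \epsilon$, so
$\lambda(E) = 0$.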
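
The covering idea behind the measure zero of a countable set can be carried out concretely. The following sketch is an illustration under stated assumptions: it lists the first few rationals in $[0, 1]$ (one possible enumeration, chosen for this example), surrounds the $i$-th point by an open interval of radius $\epsilon/2^{i+2}$, and checks with exact rational arithmetic that the total length stays below $\epsilon$. All function names are hypothetical.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """First `count` distinct rationals p/q in [0, 1], by increasing denominator."""
    seen, points, q = set(), [], 1
    while len(points) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen and len(points) < count:
                seen.add(r)
                points.append(r)
        q += 1
    return points

def cover(points, eps):
    """Open interval of radius eps / 2^(i+2) around the i-th point (i starts at 1)."""
    return [(p - eps / 2 ** (i + 2), p + eps / 2 ** (i + 2))
            for i, p in enumerate(points, start=1)]

def total_length(intervals):
    return sum(hi - lo for lo, hi in intervals)

eps = Fraction(1, 100)
points = rationals_in_unit_interval(50)
intervals = cover(points, eps)
print(float(total_length(intervals)), "<", float(eps))
```

The radii shrink geometrically, so the total length is a partial sum of a geometric series with sum $\epsilon/2$, no matter how many points are covered.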


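Theorem 7.62. Let $\lambda(E_i) = 0$ for each $i \in \mathbb{N}$. Then $\lambda\left(\bigcup_{i=1}^{\infty} E_i\right) = 0$.

Proof. Let $\epsilon > 0$. For each $i \in \mathbb{N}$ choose open intervals $\{I_{(i,n)}\}_{n \in \mathbb{N}}$ which cover $E_i$ so that
$\sum_{n=1}^{\infty} |I_{(i,n)}| < \frac{\epsilon}{2^{i+1}}$. Then $\sum_{i=1}^{\infty} \sum_{j=1}^{\infty} |I_{(i,j)}| < \epsilon$ and $\{I_{(i,j)}\}_{i,j \in \mathbb{N}}$ is a cover for $\bigcup_{i=1}^{\infty} E_i$, which
therefore has Lebesgue measure zero.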


Theorem 7.63. If we remove “open” or replace “open” by “closed” in our definition
of Lebesgue measure zero then the definitions would be equivalent.

Proof. Let $S$ be a set and let $\epsilon > 0$. First, assume $\lambda(S) = 0$. Then we can find a
countable cover of $S$ by open intervals $\{I_i\}_{i \in \mathbb{N}}$ so that $\sum_{i=1}^{\infty} |I_i| < \epsilon$. If we add the end
points to each $I_i$ the sum of the lengths of the intervals is unchanged and the intervals
still cover $S$.

Next, assume that for every $\epsilon > 0$ we can find a countable cover of $S$ by closed intervals
(or simply intervals) $\{I_i\}_{i \in \mathbb{N}}$ so that $\sum_{i=1}^{\infty} |I_i| < \epsilon$. Then choose intervals $\{A_i\}$ covering $S$ so that
$\sum_{i=1}^{\infty} |A_i| < \frac{\epsilon}{2}$. Let the left and right end points of $A_i$ be $a_i$ and $b_i$ respectively. Then
define $V_i = \left(a_i - \frac{\epsilon}{2^{i+2}}, b_i + \frac{\epsilon}{2^{i+2}}\right)$. Then $\{V_i\}_{i \in \mathbb{N}}$ is a countable collection of open
intervals which covers $S$ so that $\sum_{i=1}^{\infty} |V_i| < \epsilon$, so $\lambda(S) = 0$.


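Theorem 7.64. Let $f : D \to \mathbb{R}$ be bounded and let $I_1, I_2$ be intervals with $I_1 \subset I_2$. Then
$\Omega_f I_1 \le \Omega_f I_2$.

Proof. We know $\sup_{x \in I_1 \cap D} f(x) \le \sup_{x \in I_2 \cap D} f(x)$ and $\inf_{x \in I_1 \cap D} f(x) \ge \inf_{x \in I_2 \cap D} f(x)$ by Exercise
1.17, which means $\Omega_f I_1 \le \Omega_f I_2$.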


Theorem 7.65. Let $f : D \to \mathbb{R}$ be bounded and let $p \in D$. Then $\omega_f(p) = \inf_{h > 0} \Omega_f(p - h, p + h) \ge 0$.

Proof. Note that $\Omega_f(p - h, p + h)$ is a non-negative real number for each $h > 0$
since $\{f(x) \mid x \in (p - h, p + h) \cap D\}$ is non-empty and bounded if $p \in D$. Let $w = \inf_{h > 0} \Omega_f(p - h, p + h)$. Then $w \ge 0$ since each $\Omega_f(p - h, p + h) \ge 0$. Let $\epsilon > 0$. Then
for some $\delta > 0$ we know that $\Omega_f(p - \delta, p + \delta) < w + \epsilon$ by the Approximation Property.
However, we also know that if $0 < h < \delta$ then $\Omega_f(p - h, p + h) \le \Omega_f(p - \delta, p + \delta)$ by
Theorem 7.64, so $|\Omega_f(p - h, p + h) - w| < \epsilon$ and $w = \omega_f(p) = \lim_{h \to 0^+} \Omega_f(p - h, p + h)$.


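Theorem 7.66. Let $f : D \to \mathbb{R}$ be bounded and let $p \in D$. Then $f$ is continuous at
$p$ if and only if $\omega_f(p) = 0$.

Proof. Assume $f$ is continuous at $p$ and let $\epsilon > 0$. Choose $\delta > 0$ so that if $|x - p| < \delta$ and $x \in D$ then $|f(x) - f(p)| < \frac{\epsilon}{3}$. Then $\sup_{x \in (p - \delta, p + \delta) \cap D} f(x) \le f(p) + \frac{\epsilon}{3}$ and
$\inf_{x \in (p - \delta, p + \delta) \cap D} f(x) \ge f(p) - \frac{\epsilon}{3}$. Thus $\Omega_f(p - \delta, p + \delta) < \epsilon$, so $\omega_f(p) < \epsilon$ for all $\epsilon > 0$,
which means that $\omega_f(p) = 0$.

Assume that $\omega_f(p) = 0$ and let $\epsilon > 0$. Then since $\omega_f(p) = \inf_{h > 0} \Omega_f(p - h, p + h)$,
by the Approximation Property we can find $\delta > 0$ so that $\Omega_f(p - \delta, p + \delta) < \epsilon$, which
means that if $|x - p| < \delta$ and $x \in D$ then $|f(x) - f(p)| < \epsilon$, so $f$ is continuous at
$p$.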
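
The link between oscillation and continuity can be made tangible with a rough numerical estimate. The sketch below is only an approximation under assumptions: it samples finitely many points of $(p - h, p + h)$, so it gives a lower bound for $\Omega_f(p - h, p + h)$ rather than the exact supremum minus infimum. The step function and the names used are hypothetical examples.

```python
def oscillation_estimate(f, p, h, samples=2000):
    """Sampled estimate of sup f - inf f over the open interval (p - h, p + h)."""
    step = 2.0 * h / samples
    values = [f(p - h + (k + 0.5) * step) for k in range(samples)]
    return max(values) - min(values)

def step_function(x):
    return 0.0 if x < 0 else 1.0    # discontinuous at 0 only

def square(x):
    return x * x                    # continuous everywhere

# At the jump the estimate stays at 1 however small h is;
# at points of continuity it shrinks with h.
for h in (0.1, 0.01, 0.001):
    print(h,
          oscillation_estimate(step_function, 0.0, h),
          oscillation_estimate(step_function, 0.5, h),
          oscillation_estimate(square, 1.0, h))
```

The first column of estimates does not decrease, matching a positive oscillation at the discontinuity, while the other two columns tend to $0$.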


Theorem 7.67. Let $K$ be a closed set, and $f : K \to \mathbb{R}$ be bounded, and let $E_n = \left\{x \in K \mid \omega_f(x) \ge \frac{1}{n}\right\}$. Then $E_n$ is closed. If $K$ is compact then $E_n$ is compact.

Proof. Let $\{x_i\} \subset E_n$, where $\{x_i\} \to p$. Since $K$ is closed, $p \in K$. Then for any $h > 0$ we know that $\left(p - \frac{h}{2}, p + \frac{h}{2}\right)$ contains $x_m$ for some $m \in \mathbb{N}$. Since $\frac{1}{n} \le \omega_f(x_m) \le \Omega_f\left(x_m - \frac{h}{2}, x_m + \frac{h}{2}\right) \le \Omega_f(p - h, p + h)$ by Theorems 7.65 and 7.64, it follows that $\omega_f(p) \ge \frac{1}{n}$. Hence, $E_n$
contains all of its limit points and is closed. If $K$ is compact then $E_n$ is also bounded
and thus compact by the Heine-Borel Theorem.


Theorem 7.68. Let $f : D \to \mathbb{R}$ be bounded and let $p \in D$ and $\epsilon > 0$. If $\omega_f(p) < \epsilon$
then there is a $\delta > 0$ so that $\Omega_f(p - \delta, p + \delta) < \epsilon$.

Proof. This follows directly from the Approximation Property since we know that
$\omega_f(p) = \inf_{h > 0} \Omega_f(p - h, p + h)$.


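Theorem 7.69. Let $f : K \to \mathbb{R}$ be a bounded function, with $K$ a compact set. Let
$\epsilon > 0$ and $\omega_f(p) < \epsilon$ for each $p \in K$. Then there is a $\delta > 0$ so that if $I$ is an interval
so that $|I| < \delta$ and $I \cap K \neq \emptyset$ then $\Omega_f I < \epsilon$.

Proof. By Theorem 7.68 for each $p \in K$ we can find an $\epsilon_p > 0$ so that $\Omega_f(p - \epsilon_p, p + \epsilon_p) < \epsilon$. Then $C = \{(p - \epsilon_p, p + \epsilon_p)\}_{p \in K}$ is an open cover of $K$, so by the Lebesgue Number
Lemma we can find $\delta > 0$ so that if $I$ is an interval and $I \cap K \neq \emptyset$ and $|I| < \delta$ then
$I \subset (p - \epsilon_p, p + \epsilon_p)$ for some $p \in K$, which means that $\Omega_f I < \epsilon$ by Theorem 7.64.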


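Theorem 7.70. Let $S$ be a set and $\epsilon > 0$ such that for every countable collection $\{I_i\}_{i \in \mathbb{N}}$ of
open intervals which cover $S$, $\sum_{i=1}^{\infty} |I_i| \ge \epsilon$. If $\{U_i\}_{i \in \mathbb{N}}$ is any countable collection of
intervals which covers $S$ then $\sum_{i=1}^{\infty} |U_i| \ge \epsilon$.

Proof. Suppose that there are intervals $\{U_i\}_{i \in \mathbb{N}}$ which cover $S$ so that $\sum_{i=1}^{\infty} |U_i| = r < \epsilon$,
where the left and right end points of $U_i$ are $a_i$ and $b_i$ respectively. Then define
$V_i = \left(a_i - \frac{\epsilon - r}{2^{i+2}}, b_i + \frac{\epsilon - r}{2^{i+2}}\right)$, and $\{V_i\}_{i \in \mathbb{N}}$ is a countable collection of open intervals
which covers $S$ so that $\sum_{i=1}^{\infty} |V_i| = r + \sum_{i=1}^{\infty} \frac{\epsilon - r}{2^{i+1}} = r + \frac{\epsilon - r}{2} < \epsilon$, a contradiction.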




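Theorem 7.71. Let $E$ be a set and $\gamma > 0$ so that for any sequence of intervals $\{I_i\}$
which covers $E$, $\sum_{i=1}^{\infty} |I_i| \ge \gamma$. Let $C = \{I_i\}$ be such a cover. Then $\sum_{\{i \in \mathbb{N} \mid I_i^{\circ} \cap E \neq \emptyset\}} |I_i| \ge \gamma$,
where $I_i^{\circ}$ denotes the interior of $I_i$.

Proof. Suppose that $\sum_{\{i \mid I_i^{\circ} \cap E \neq \emptyset\}} |I_i| = \alpha < \gamma$. Then all of the points of $E$ not covered
by $C^{\circ} = \{I_i^{\circ} \mid I_i^{\circ} \cap E \neq \emptyset\}$ are end points $a_i, b_i$ of the $I_i$ intervals. Thus, if we set
$B = \bigcup_{i=1}^{\infty} \{a_i, b_i\}$ then $\lambda(B) = 0$ since $B$ is countable, and so we can find a countable
collection of intervals $D = \{R_i\}_{i \in \mathbb{N}}$ which cover $B$ so that $\sum_{i=1}^{\infty} |R_i| < \gamma - \alpha$. Hence,
the set of all intervals in $C^{\circ} \cup D$ is a countable collection of intervals which cover $E$,
the sum of whose lengths is less than $\gamma$, a contradiction.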


Theorem 7.72. Let $P = \{x_0, x_1, \ldots, x_s\}$ be a partition of $[a, b]$.
Let $C = \{[x_{n_1 - 1}, x_{n_1}], [x_{n_2 - 1}, x_{n_2}], \ldots, [x_{n_t - 1}, x_{n_t}]\}$ be a collection of distinct subintervals determined by $P$,
and let $K = \bigcup C$. Let $D = \{I_1, I_2, \ldots, I_m\}$ be a cover of $K$ by open intervals $I_i = (a_i, b_i)$. Then $\sum_{i=1}^{m} |I_i| > \sum_{i=1}^{t} (x_{n_i} - x_{n_i - 1})$.

Proof. Since $K$ is a finite union of closed sets $K$ is closed, and since $K \subset [a, b]$, $K$ is
bounded, so $K$ is compact by the Heine-Borel Theorem.

By the Lebesgue Number Lemma we can find $\delta > 0$ so that if $J$ is a closed interval
of length less than $\delta$ and $J$ intersects $K$ then $J$ is a subset of some $I_i \in D$.

Next, we choose a partition $Q$ of $[a, b]$ that refines $P$ and has mesh less than $\delta$,
where $Q = \{q_0, q_1, q_2, \ldots, q_w\}$. Then

\[ \sum_{i=1}^{t} (x_{n_i} - x_{n_i - 1}) = \sum_{i=1}^{t} \sum_{\{j \in \mathbb{N} \mid [q_{j-1}, q_j] \subset [x_{n_i - 1}, x_{n_i}]\}} (q_j - q_{j-1}) \le \sum_{i=1}^{m} \sum_{\{j \in \mathbb{N} \mid [q_{j-1}, q_j] \subset I_i\}} (q_j - q_{j-1}) < \sum_{i=1}^{m} |I_i|. \]

The reason the last inequality is strict is that, for each $i$, if $p$ is the least integer
so that $[q_p, q_{p+1}] \subset I_i$ and $r$ is the last integer so that $[q_r, q_{r+1}] \subset I_i$ then $a_i < q_p$ and
$q_{r+1} < b_i$, which means $b_i - a_i > q_{r+1} - q_p = \sum_{j=p}^{r} (q_{j+1} - q_j) = \sum_{\{j \in \mathbb{N} \mid [q_{j-1}, q_j] \subset I_i\}} (q_j - q_{j-1})$.

We did not list justifications for the other parts of the expression above since they
are similar to things we have observed earlier, but to clarify further, since $Q$ refines
$P$, for a given interval $[x_{n_i - 1}, x_{n_i}]$ there is some $k$ so that $q_k = x_{n_i - 1}$ and some $r$ so
that $q_{r+1} = x_{n_i}$, which tells us that

\[ \sum_{\{j \in \mathbb{N} \mid [q_{j-1}, q_j] \subset [x_{n_i - 1}, x_{n_i}]\}} (q_j - q_{j-1}) = (q_{k+1} - q_k) + (q_{k+2} - q_{k+1}) + \cdots + (q_{r+1} - q_r) = q_{r+1} - q_k = x_{n_i} - x_{n_i - 1}. \]

This justifies the first equality.

For the first inequality, we notice that each $q_{j+1} - q_j$ in the preceding sum is
added once (since $[q_j, q_{j+1}]$ cannot be a subset of different intervals $[x_{n_i - 1}, x_{n_i}]$).
Each such $[q_j, q_{j+1}]$ is a subset of some $I_i$ by the Lebesgue Number Lemma since
$q_{j+1} - q_j < \delta$ and $[q_j, q_{j+1}] \cap K \neq \emptyset$. Thus, every term $q_{j+1} - q_j$ in the sum
$\sum_{i=1}^{t} \sum_{\{j \in \mathbb{N} \mid [q_{j-1}, q_j] \subset [x_{n_i - 1}, x_{n_i}]\}} (q_j - q_{j-1})$ is a summand in the sum $\sum_{i=1}^{m} \sum_{\{j \in \mathbb{N} \mid [q_{j-1}, q_j] \subset I_i\}} (q_j - q_{j-1})$.
It is possible that such a term may appear more than once in the second sum, since
it is possible for $[q_{j-1}, q_j]$ to be a subset of more than one $I_i$. It is also possible
that there are intervals $[q_{j-1}, q_j]$ contained in an element of $D$ which do not appear
in the left sum. Either of these two possibilities would result in a larger sum on the
right. Thus, the inequality follows.


The following theorem is one of the most helpful results for deciding when a 
function is Riemann integrable. 


Theorem 7.73. Lebesgue Characterization of Riemann Integrability. Let $f : [a, b] \to \mathbb{R}$
be bounded. Then $f$ is integrable if and only if the set $E = \{x \in [a, b] \mid f \text{ is not continuous at } x\}$
has Lebesgue measure zero.


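Proof. Since $f$ is bounded we can choose $M > 0$ so that $|f(x)| < M$ for all $x \in [a, b]$.
For each $n \in \mathbb{N}$ let $E_n = \left\{x \in [a, b] \mid \omega_f(x) \ge \frac{1}{n}\right\}$. Note that $E = \bigcup_{n=1}^{\infty} E_n$. If $\lambda(E) = 0$
then $\lambda(E_n) = 0$ for each $n \in \mathbb{N}$ by Theorem 7.60.

Assume that $f$ is integrable. Suppose that $\lambda(E) \neq 0$. Then for some $m \in \mathbb{N}$
we know from Theorem 7.62 that $\lambda(E_m) \neq 0$, so there is a number $\gamma > 0$ so that
if $\{I_i\}_{i \in \mathbb{N}}$ is an open cover of $E_m$ then $\sum_{i=1}^{\infty} |I_i| \ge \gamma$. Let $P = \{x_0, x_1, x_2, \ldots, x_s\}$ be a
partition of $[a, b]$. Then

\[ U(f, P) - L(f, P) = \sum_{\{i \in \mathbb{N} \mid (x_{i-1}, x_i) \cap E_m \neq \emptyset\}} (M_i - m_i)(x_i - x_{i-1}) + \sum_{\{i \in \mathbb{N} \mid (x_{i-1}, x_i) \cap E_m = \emptyset\}} (M_i - m_i)(x_i - x_{i-1}). \]

By Theorem 7.65 we know $M_i - m_i \ge \frac{1}{m}$ if $(x_{i-1}, x_i) \cap E_m \neq \emptyset$, so we know that
$\sum_{\{i \in \mathbb{N} \mid (x_{i-1}, x_i) \cap E_m \neq \emptyset\}} (M_i - m_i)(x_i - x_{i-1}) \ge \frac{\gamma}{m}$,
since $\sum_{\{i \in \mathbb{N} \mid (x_{i-1}, x_i) \cap E_m \neq \emptyset\}} (x_i - x_{i-1}) \ge \gamma$ by Theorems 7.70 and 7.71, and thus $f$ is not integrable,
a contradiction.

Assume that $\lambda(E) = 0$. Let $\epsilon > 0$. Choose $j \in \mathbb{N}$ so that $\frac{b - a}{j} < \frac{\epsilon}{2}$. Choose a
countable cover of $E_j$ by open intervals $\{I_i\}_{i \in \mathbb{N}}$ so that $\sum_{i=1}^{\infty} |I_i| < \frac{\epsilon}{4M}$. Since $E_j$ is
compact by Theorem 7.67, we can find a finite subcover $F = \{I_{n_1}, I_{n_2}, \ldots, I_{n_t}\}$. Let
$K = [a, b] \setminus \bigcup_{i=1}^{t} I_{n_i}$. We know $K$ is compact by the Heine-Borel Theorem, and $\omega_f(p) < \frac{1}{j}$
for all $p \in K$, so by Theorem 7.69 we can find a number $\delta > 0$ so that if $J$ is an
interval intersecting $K$ with $|J| < \delta$ then $\Omega_f J < \frac{1}{j}$.

Let $P = \{x_0, x_1, x_2, \ldots, x_s\}$ be a partition of $[a, b]$ with $|P| < \delta$. Then

\[ U(f, P) - L(f, P) = \sum_{\{i \in \mathbb{N} \mid [x_{i-1}, x_i] \cap K \neq \emptyset\}} (M_i - m_i)(x_i - x_{i-1}) + \sum_{\{i \in \mathbb{N} \mid [x_{i-1}, x_i] \cap K = \emptyset\}} (M_i - m_i)(x_i - x_{i-1}). \]

Since the mesh of $P$ is less than $\delta$ we know that $M_i - m_i \le \frac{1}{j}$ if $[x_{i-1}, x_i] \cap K \neq \emptyset$, so
$\sum_{\{i \in \mathbb{N} \mid [x_{i-1}, x_i] \cap K \neq \emptyset\}} (M_i - m_i)(x_i - x_{i-1}) \le \frac{b - a}{j} < \frac{\epsilon}{2}$.
By Theorem 7.72, we know that $\sum_{\{i \in \mathbb{N} \mid [x_{i-1}, x_i] \cap K = \emptyset\}} (x_i - x_{i-1}) < \frac{\epsilon}{4M}$ since $F$ covers
the union of all intervals $[x_{i-1}, x_i]$ which do not intersect $K$, so
$\sum_{\{i \in \mathbb{N} \mid [x_{i-1}, x_i] \cap K = \emptyset\}} (M_i - m_i)(x_i - x_{i-1}) < (2M)\frac{\epsilon}{4M} = \frac{\epsilon}{2}$. Hence, $U(f, P) - L(f, P) < \epsilon$ and $f$ is integrable.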


Note: The name “Lebesgue Characterization of Riemann Integrability” is descriptive,
and a name we may refer to, but it does not appear to be an official name for
the preceding theorem that is normally used in the literature. It is referred to as the
“Lebesgue Criterion of Riemann Integrability” or the “Riemann-Lebesgue Theorem,”
but there does not appear to be a consistently used name for the theorem that is
universally preferred.


Wallis’s Formula: 


Wallis’s Formula is very much optional, but it makes some common definite
integrals quick to evaluate and will simplify our work in some later examples and
exercises. It uses a reduction formula, so we will derive some of those first, using
integration by parts.


Theorem 7.74. Trigonometric integral reduction formulas. Let $n, m$ be positive
integer powers greater than or equal to two. Then:

(1) $\int \sec^n(x)\,dx = \frac{1}{n-1}\sec^{n-2}(x)\tan(x) + \frac{n-2}{n-1}\int \sec^{n-2}(x)\,dx$

(2) $\int \csc^n(x)\,dx = -\frac{1}{n-1}\csc^{n-2}(x)\cot(x) + \frac{n-2}{n-1}\int \csc^{n-2}(x)\,dx$

(3) $\int \cos^n(x)\,dx = \frac{1}{n}\cos^{n-1}(x)\sin(x) + \frac{n-1}{n}\int \cos^{n-2}(x)\,dx$


(5) [ sin"(2) cos (x)de = - 


(6) [ew cos” (x)dx = — 


Mart 
1 
(7) [ex waz = ;, tan” (2) _ few *@ade 
1 

(8) [cota ae cot” *(z) — [cot (eae 
Proof. (1) We use integration by parts, with the part to be differentiated $\sec^{n-2}(x)$
and the factor to be integrated $\sec^2(x)$. This gives us:

\[ \int \sec^n(x)\,dx = \sec^{n-2}(x)\tan(x) - \int (n-2)\sec^{n-3}(x)\sec(x)\tan(x)\tan(x)\,dx. \]

Then, recalling that $\tan^2(x) = \sec^2(x) - 1$ in the last integral, this becomes:

\[ \int \sec^n(x)\,dx = \sec^{n-2}(x)\tan(x) - \int (n-2)\sec^{n-2}(x)(\sec^2(x) - 1)\,dx, \]
\[ \int \sec^n(x)\,dx = \sec^{n-2}(x)\tan(x) + (n-2)\int \sec^{n-2}(x)\,dx - (n-2)\int \sec^n(x)\,dx, \]
\[ \int \sec^n(x)\,dx + (n-2)\int \sec^n(x)\,dx = \sec^{n-2}(x)\tan(x) + (n-2)\int \sec^{n-2}(x)\,dx, \]

so $(n-1)\int \sec^n(x)\,dx = \sec^{n-2}(x)\tan(x) + (n-2)\int \sec^{n-2}(x)\,dx$,

and hence $\int \sec^n(x)\,dx = \frac{1}{n-1}\sec^{n-2}(x)\tan(x) + \frac{n-2}{n-1}\int \sec^{n-2}(x)\,dx$.

(2) Using parts again with the integrated factor $dv = \csc^2(x)\,dx$ and the differentiated
factor $u = \csc^{n-2}(x)$ we obtain:

\[ \int \csc^n(x)\,dx = -\csc^{n-2}(x)\cot(x) - \int (n-2)\csc^{n-3}(x)(-\csc(x)\cot(x))(-\cot(x))\,dx. \]

Then, recalling that $\cot^2(x) = \csc^2(x) - 1$ in the last integral, this becomes:

\[ \int \csc^n(x)\,dx = -\csc^{n-2}(x)\cot(x) - \int (n-2)\csc^{n-2}(x)(\csc^2(x) - 1)\,dx, \text{ so} \]
\[ \int \csc^n(x)\,dx = -\csc^{n-2}(x)\cot(x) + (n-2)\int \csc^{n-2}(x)\,dx - (n-2)\int \csc^n(x)\,dx, \]
\[ (n-1)\int \csc^n(x)\,dx = -\csc^{n-2}(x)\cot(x) + (n-2)\int \csc^{n-2}(x)\,dx, \]
\[ \int \csc^n(x)\,dx = -\frac{1}{n-1}\csc^{n-2}(x)\cot(x) + \frac{n-2}{n-1}\int \csc^{n-2}(x)\,dx. \]


(3) Use parts again, setting $u = \cos^{n-1}(x)$ and $dv = \cos(x)\,dx$, which gives:

\[ \int \cos^n(x)\,dx = \cos^{n-1}(x)\sin(x) - \int (n-1)\cos^{n-2}(x)(-\sin(x))(\sin(x))\,dx. \]

Using the identity $\sin^2(x) = 1 - \cos^2(x)$ gives:

\[ \int \cos^n(x)\,dx = \cos^{n-1}(x)\sin(x) + \int (n-1)\cos^{n-2}(x)(1 - \cos^2(x))\,dx, \]
\[ \int \cos^n(x)\,dx = \cos^{n-1}(x)\sin(x) + (n-1)\int \cos^{n-2}(x)\,dx - (n-1)\int \cos^n(x)\,dx, \]
\[ (n-1)\int \cos^n(x)\,dx + \int \cos^n(x)\,dx = \cos^{n-1}(x)\sin(x) + (n-1)\int \cos^{n-2}(x)\,dx, \]

so $n\int \cos^n(x)\,dx = \cos^{n-1}(x)\sin(x) + (n-1)\int \cos^{n-2}(x)\,dx$. Thus,

\[ \int \cos^n(x)\,dx = \frac{1}{n}\cos^{n-1}(x)\sin(x) + \frac{n-1}{n}\int \cos^{n-2}(x)\,dx. \]

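(4) Use parts, setting $u = \sin^{n-1}(x)$ and $dv = \sin(x)\,dx$, which gives:

\[ \int \sin^n(x)\,dx = \sin^{n-1}(x)(-\cos(x)) - \int (n-1)\sin^{n-2}(x)(\cos(x))(-\cos(x))\,dx. \]

Using the identity $\cos^2(x) = 1 - \sin^2(x)$ on the last integral gives:

\[ \int \sin^n(x)\,dx = -\sin^{n-1}(x)\cos(x) + \int (n-1)\sin^{n-2}(x)(1 - \sin^2(x))\,dx, \text{ so} \]
\[ \int \sin^n(x)\,dx = -\sin^{n-1}(x)\cos(x) + (n-1)\int \sin^{n-2}(x)\,dx - (n-1)\int \sin^n(x)\,dx, \]
\[ n\int \sin^n(x)\,dx = -\sin^{n-1}(x)\cos(x) + (n-1)\int \sin^{n-2}(x)\,dx. \]

Thus,

\[ \int \sin^n(x)\,dx = -\frac{1}{n}\sin^{n-1}(x)\cos(x) + \frac{n-1}{n}\int \sin^{n-2}(x)\,dx. \]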


(5) Use parts directly with the factor to be integrated being $\sin^n(x)\cos(x)$ and the
factor to be differentiated being $\cos^{m-1}(x)$. A single use of parts gives us

\[ \int \sin^n(x)\cos^m(x)\,dx = \frac{1}{n+1}\cos^{m-1}(x)\sin^{n+1}(x) + \frac{m-1}{n+1}\int \sin^{n+2}(x)\cos^{m-2}(x)\,dx \]

as desired.

(6) Use parts again, with the factor to be integrated being $\cos^m(x)\sin(x)$ and the
factor to be differentiated being $\sin^{n-1}(x)$. Using parts gives us

\[ \int \sin^n(x)\cos^m(x)\,dx = -\frac{1}{m+1}\sin^{n-1}(x)\cos^{m+1}(x) + \frac{n-1}{m+1}\int \sin^{n-2}(x)\cos^{m+2}(x)\,dx. \]

(7) The last two formulas don’t require parts to derive. We just use the formulas
$\tan^2(x) = \sec^2(x) - 1$ and $\cot^2(x) = \csc^2(x) - 1$ as follows:

\[ \int \tan^n(x)\,dx = \int \tan^{n-2}(x)\tan^2(x)\,dx = \int \tan^{n-2}(x)\sec^2(x)\,dx - \int \tan^{n-2}(x)\,dx. \]

Setting $u = \tan(x)$, $du = \sec^2(x)\,dx$ for the first integral gives:

\[ \int \tan^n(x)\,dx = \int u^{n-2}\,du - \int \tan^{n-2}(x)\,dx, \text{ so} \]
\[ \int \tan^n(x)\,dx = \frac{\tan^{n-1}(x)}{n-1} - \int \tan^{n-2}(x)\,dx. \]

(8) The last formula is derived similarly.

\[ \int \cot^n(x)\,dx = \int \cot^{n-2}(x)\cot^2(x)\,dx = \int \cot^{n-2}(x)\csc^2(x)\,dx - \int \cot^{n-2}(x)\,dx. \]

Setting $u = \cot(x)$, $du = -\csc^2(x)\,dx$ for the first integral gives:

\[ \int \cot^n(x)\,dx = -\int u^{n-2}\,du - \int \cot^{n-2}(x)\,dx, \text{ so} \]
\[ \int \cot^n(x)\,dx = -\frac{\cot^{n-1}(x)}{n-1} - \int \cot^{n-2}(x)\,dx. \]
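A reduction formula is easy to sanity-check numerically. The sketch below is an illustration only, not part of the proof: it applies formula (3) recursively on an interval $[0, b]$ and compares the result with a direct numerical integral. The helper names `simpson` and `cos_power_integral` and the choice $b = 1.2$ are assumptions made for this example.

```python
import math

def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def cos_power_integral(n, b):
    """Integral of cos^n over [0, b], computed with reduction formula (3)."""
    if n == 0:
        return b
    if n == 1:
        return math.sin(b)
    # Boundary term (1/n) cos^(n-1)(x) sin(x) evaluated from 0 to b;
    # the value at 0 vanishes because sin(0) = 0.
    boundary = math.cos(b) ** (n - 1) * math.sin(b) / n
    return boundary + (n - 1) / n * cos_power_integral(n - 2, b)

b = 1.2
for n in range(2, 7):
    direct = simpson(lambda x: math.cos(x) ** n, 0.0, b)
    print(n, direct, cos_power_integral(n, b))
```

The two printed columns should agree to many decimal places for each power n.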


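Theorem 7.75. Wallis’s Formula. Let m, n be non-negative integers. Then the
integral

\[ \int_0^{\pi/2} \sin^n(x)\cos^m(x)\,dx = \frac{(n-1)(n-3)(n-5)\cdots(1)\,(m-1)(m-3)(m-5)\cdots(1)}{(n+m)(n+m-2)(n+m-4)\cdots(1)} \]

times $\frac{\pi}{2}$ if both n and m are even. Here each product decreases in steps of two and stops at its
last positive factor (2 or 1), and an empty product is taken to be 1. Furthermore, for any integers
$k < j$, if the number of intervals of the form $\left(\frac{i\pi}{2}, \frac{(i+1)\pi}{2}\right)$ for $k \le i < j$ for which $\sin^n(x)\cos^m(x)$ is
positive on $\left(\frac{i\pi}{2}, \frac{(i+1)\pi}{2}\right)$ is P and the number of intervals of the form $\left(\frac{i\pi}{2}, \frac{(i+1)\pi}{2}\right)$
for $k \le i < j$ for which $\sin^n(x)\cos^m(x)$ is negative on $\left(\frac{i\pi}{2}, \frac{(i+1)\pi}{2}\right)$ is N then

\[ \int_{k\pi/2}^{j\pi/2} \sin^n(x)\cos^m(x)\,dx = (P - N)\int_0^{\pi/2} \sin^n(x)\cos^m(x)\,dx. \]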


us 


Proof. Using Theorem 7.74 part (5) on the integral $\int_0^{\pi/2} \sin^n(x)\cos^m(x)\,dx$ we get

\[ \int_0^{\pi/2} \sin^n(x)\cos^m(x)\,dx = \left[\frac{\sin^{n+1}(x)\cos^{m-1}(x)}{n+1}\right]_0^{\pi/2} + \frac{m-1}{n+1}\int_0^{\pi/2} \sin^{n+2}(x)\cos^{m-2}(x)\,dx, \]

which is just equal to the integral $\frac{m-1}{n+1}\int_0^{\pi/2} \sin^{n+2}(x)\cos^{m-2}(x)\,dx$, assuming that
$m - 1$ is positive. Using (5) again on this integral with $n + 2$ and $m - 2$ as the new
powers gives us that the integral is equal to $\frac{(m-1)(m-3)}{(n+1)(n+3)}\int_0^{\pi/2} \sin^{n+4}(x)\cos^{m-4}(x)\,dx$,
assuming that $m - 3$ is still positive. We iterate this process until the power of cosine
is either zero or one. In the case where m is even we are able to repeat this process $\frac{m}{2}$
times, leaving us with

\[ \frac{(m-1)(m-3)\cdots(1)}{(n+m-1)(n+m-3)\cdots(n+3)(n+1)}\int_0^{\pi/2} \sin^{n+m}(x)\,dx. \]

If m is odd then we are only able to perform this reduction $\frac{m-1}{2}$ times, leaving us with

\[ \frac{(m-1)(m-3)\cdots(2)}{(n+m-2)\cdots(n+3)(n+1)}\int_0^{\pi/2} \sin^{n+m-1}(x)\cos(x)\,dx. \]

Then, setting $u = \sin(x)$ and $du = \cos(x)\,dx$, the last integral becomes $\int_0^1 u^{n+m-1}\,du = \frac{1}{n+m}$.
Multiplying this by the preceding coefficient gives

\[ \frac{(m-1)(m-3)\cdots(2)}{(n+m)(n+m-2)\cdots(n+3)(n+1)}, \]

which equals the expression in the theorem, since the factors $(n-1)(n-3)\cdots$ in its numerator
cancel with the corresponding final factors of its denominator, as desired.


Assuming that m is even, we continue, using formula (4), to give

\[ \int_0^{\pi/2} \sin^{n+m}(x)\,dx = \left[-\frac{1}{n+m}\sin^{n+m-1}(x)\cos(x)\right]_0^{\pi/2} + \frac{n+m-1}{n+m}\int_0^{\pi/2} \sin^{n+m-2}(x)\,dx. \]

Assuming that $n + m - 1$ is positive this becomes $\frac{n+m-1}{n+m}\int_0^{\pi/2} \sin^{n+m-2}(x)\,dx$. In the case where $n + m$ is
even we are able to repeat this process $\frac{n+m}{2}$ times, finally arriving at the integral

\[ \frac{(n+m-1)(n+m-3)\cdots(1)}{(n+m)(n+m-2)\cdots(2)}\int_0^{\pi/2} 1\,dx. \]

Thus, the original integral if both n, m are even is

\[ \frac{(m-1)(m-3)\cdots(1)}{(n+m-1)(n+m-3)\cdots(n+3)(n+1)} \cdot \frac{(n+m-1)(n+m-3)\cdots(1)}{(n+m)(n+m-2)\cdots(2)} \cdot \frac{\pi}{2}. \]

The denominator of the first fraction cancels with the factors of the numerator of the second
fraction from $(n+m-1)$ down to $(n+1)$, so that the first remaining factor in that numerator is $(n-1)$,
leaving us with Wallis’s formula,

\[ \int_0^{\pi/2} \sin^n(x)\cos^m(x)\,dx = \frac{(n-1)(n-3)(n-5)\cdots(1)\,(m-1)(m-3)(m-5)\cdots(1)}{(n+m)(n+m-2)(n+m-4)\cdots(2)} \cdot \frac{\pi}{2}, \]

as desired.


In the event that m is even but n is odd, we can only repeat this process $\frac{n+m-1}{2}$
times, leaving us with

\[ \int_0^{\pi/2} \sin^{n+m}(x)\,dx = \frac{(n+m-1)(n+m-3)\cdots(2)}{(n+m)(n+m-2)\cdots(3)}\int_0^{\pi/2} \sin(x)\,dx. \]

Since $\int_0^{\pi/2} \sin(x)\,dx = 1$, we get that

\[ \int_0^{\pi/2} \sin^n(x)\cos^m(x)\,dx = \frac{(m-1)(m-3)\cdots(1)}{(n+m-1)(n+m-3)\cdots(n+3)(n+1)} \cdot \frac{(n+m-1)(n+m-3)\cdots(2)}{(n+m)(n+m-2)\cdots(3)}\,(1) = \frac{(n-1)(n-3)(n-5)\cdots(2)\,(m-1)(m-3)(m-5)\cdots(1)}{(n+m)(n+m-2)(n+m-4)\cdots(3)}, \]

as desired.



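The last part of the theorem follows from a substitution and a trigonometric
identity. We focus first on integrating from a particular integer multiple of $\frac{\pi}{2}$ to the
next integer multiple of $\frac{\pi}{2}$, that is, on $\int_{i\pi/2}^{(i+1)\pi/2} \sin^n(x)\cos^m(x)\,dx$. We first
make the substitution $u = x - \frac{i\pi}{2}$. Then $du = dx$ and $x = u + \frac{i\pi}{2}$. The integral
then becomes $\int_0^{\pi/2} \sin^n\left(u + \frac{i\pi}{2}\right)\cos^m\left(u + \frac{i\pi}{2}\right)du$. Using the sine and cosine sum of
angles formulas, we see that $\sin\left(u + \frac{i\pi}{2}\right) = \sin(u)\cos\left(\frac{i\pi}{2}\right) + \cos(u)\sin\left(\frac{i\pi}{2}\right)$, and
$\cos\left(u + \frac{i\pi}{2}\right) = \cos(u)\cos\left(\frac{i\pi}{2}\right) - \sin(u)\sin\left(\frac{i\pi}{2}\right)$. If i is odd then these simplify to
$\sin\left(u + \frac{i\pi}{2}\right) = \pm\cos(u)$ and $\cos\left(u + \frac{i\pi}{2}\right) = \pm\sin(u)$. If i is even then they simplify
to $\sin\left(u + \frac{i\pi}{2}\right) = \pm\sin(u)$ and $\cos\left(u + \frac{i\pi}{2}\right) = \pm\cos(u)$. Hence, in all possible cases
the integrand is $\pm\sin^n(u)\cos^m(u)$ or $\pm\sin^m(u)\cos^n(u)$. Since Wallis’s formula (over
$\left(0, \frac{\pi}{2}\right)$) is symmetric in n and m, this tells us that

\[ \int_{i\pi/2}^{(i+1)\pi/2} \sin^n(x)\cos^m(x)\,dx = \pm\int_0^{\pi/2} \sin^n(x)\cos^m(x)\,dx, \]

where the integral will be positive if $\sin^n(x)\cos^m(x) > 0$ on $\left(\frac{i\pi}{2}, \frac{(i+1)\pi}{2}\right)$ and negative if
$\sin^n(x)\cos^m(x) < 0$ on $\left(\frac{i\pi}{2}, \frac{(i+1)\pi}{2}\right)$.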




If we integrate $\int_{k\pi/2}^{j\pi/2} \sin^n(x)\cos^m(x)\,dx$ for integers $k < j$ then we can separate
the integral into the sum of the integrals over the P intervals $\left(\frac{i\pi}{2}, \frac{(i+1)\pi}{2}\right)$ where
$\sin^n(x)\cos^m(x) > 0$ and $k \le i < j$, which equals $P\int_0^{\pi/2} \sin^n(x)\cos^m(x)\,dx$, plus the
sum of the integrals over the N intervals $\left(\frac{i\pi}{2}, \frac{(i+1)\pi}{2}\right)$ where $\sin^n(x)\cos^m(x) < 0$
and $k \le i < j$, which equals $-N\int_0^{\pi/2} \sin^n(x)\cos^m(x)\,dx$, so

\[ \int_{k\pi/2}^{j\pi/2} \sin^n(x)\cos^m(x)\,dx = (P - N)\int_0^{\pi/2} \sin^n(x)\cos^m(x)\,dx. \]
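As a quick check of Wallis’s Formula, the following sketch evaluates the formula and compares it with a direct numerical integral. It is an illustration under stated assumptions, not part of the proof: the function names `descending_product`, `wallis` and `midpoint_rule` are invented for this example, and each product is read as descending in steps of two down to 2 or 1, with an empty product equal to 1.

```python
import math

def descending_product(top):
    """top * (top - 2) * (top - 4) * ... down to 2 or 1; an empty product is 1."""
    result = 1
    while top > 1:
        result *= top
        top -= 2
    return result

def wallis(n, m):
    """Value of the integral of sin^n(x) cos^m(x) over [0, pi/2]."""
    value = (descending_product(n - 1) * descending_product(m - 1)
             / descending_product(n + m))
    if n % 2 == 0 and m % 2 == 0:
        value *= math.pi / 2  # extra factor only when both powers are even
    return value

def midpoint_rule(f, a, b, steps=20000):
    """Simple midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

for n, m in [(2, 2), (3, 2), (4, 2), (1, 1)]:
    numeric = midpoint_rule(lambda x: math.sin(x) ** n * math.cos(x) ** m,
                            0.0, math.pi / 2)
    print(n, m, wallis(n, m), numeric)
```

For the last part of the theorem, take n = 3 and m = 2 on $[0, \pi]$: the integrand is positive on both quarter-period intervals, so P = 2 and N = 0, and the integral should be twice the value over $[0, \frac{\pi}{2}]$.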




7.9 Exercises for Supplementary Materials for One Variable 


Exercise 7.1. Let $m, n \in \mathbb{N}$. Then either $m^{1/n} \in \mathbb{N}$ or $m^{1/n}$ is irrational.


Exercise 7.2. Euclidean Algorithm (main premise). Let a, b be positive integers so
that $a < b$ and q is the largest integer so that $b - aq > 0$. Then the greatest common
divisor of a and b is the same as the greatest common divisor of a and $b - aq$.


Exercise 7.3. Let m, n be natural numbers. Then there are integers s, t so that $\frac{m}{n} = \frac{s}{t}$
and the prime factor decompositions of s and t do not share any prime factors (such a
fraction $\frac{s}{t}$ is said to be written in reduced terms). If p is a prime factor of m and is
not a prime factor of n and $\frac{m}{n}$ is an integer then p is a prime factor of the integer $\frac{m}{n}$.


Exercise 7.4. Fermat’s Little Theorem. Let p be prime. Then $a^p - a$ is divisible by
p for every natural number a.


Exercise 7.5. If A and B are countable sets then A x B is countable. 


Exercise 7.6. Let $a < b$. Then there is an irrational number r so that $a < r < b$.


Exercise 7.7. Let $f : E \to \mathbb{R}$. Then f is continuous if and only if for every closed
set $A \subset \mathbb{R}$, the set $f^{-1}(A)$ is closed in E.


Exercise 7.8. Let S be uncountable. Then S has a limit point.


Exercise 7.9. Let $f : [a,b] \to \mathbb{R}$ be integrable. Let $g : [a,b] \to \mathbb{R}$ and let $F =
\{x_1, x_2, \ldots, x_n\}$ be a finite subset of $[a,b]$, where $g(x) = f(x)$ for all $x \in [a,b] \setminus F$.
Then $\int_a^b g = \int_a^b f$.




Hints: 


Hint to Exercise 7.1. Let $m, n \in \mathbb{N}$. Then either $m^{1/n} \in \mathbb{N}$ or $m^{1/n}$ is irrational.


Suppose by way of contradiction that $\frac{p}{q} = m^{1/n}$, where $p \in \mathbb{N}$ and $q \in \mathbb{N}$ and $\frac{p}{q}$ is in
reduced terms with $q > 1$. Explain why it is impossible for $p^n$ to be an integer multiple of
$q^n$.


Hint to Exercise 7.2. Euclidean Algorithm (main premise). Let a, b be positive integers
so that $a < b$ and q is the largest integer so that $b - aq > 0$. Then the greatest common
divisor of a and b is the same as the greatest common divisor of a and $b - aq$.


First, assume that D is a common divisor of a and $b - aq$. Show that D is also a divisor
of b using the fact that $a = mD$ and $b - aq = nD$ for some integers n, m. Then show that
any common divisor of a and b is also a divisor of a and $b - aq$.


Hint to Exercise 7.3. Let m, n be natural numbers. Then there are integers s, t so that
$\frac{m}{n} = \frac{s}{t}$ and the prime factor decompositions of s and t do not share any prime factors (such a
fraction $\frac{s}{t}$ is said to be written in reduced terms). If p is a prime factor of m and is not a
prime factor of n and $\frac{m}{n}$ is an integer then p is a prime factor of the integer $\frac{m}{n}$.


Use the Fundamental Theorem of Arithmetic. 


Hint to Exercise 7.4. Fermat’s Little Theorem. Let p be prime. Then $a^p - a$ is divisible
by p for every natural number a.


Use induction and the definition of divisibility. Recall that what it means for $a^p - a$ to be
divisible by p is that $a^p - a = mp$ for some natural number m. Use the Binomial Theorem
and explain why, for a prime number p, it is true that the factor p in the numerator of
$\frac{p!}{i!(p-i)!}$ does not cancel with any factor of the denominator, assuming $1 \le i \le p - 1$. You
may wish to use the Fundamental Theorem of Arithmetic for that.


Hint to Exercise 7.5. If A and B are countable sets then A x B is countable. 


Use the fact that the union of countably many countable sets is countable. 


Hint to Exercise 7.6. Let $a < b$. Then there is an irrational number r so that $a < r < b$.


If all the points between a and b are rational, then explain why the set of such points is
countable (which contradicts a theorem).


Hint to Exercise 7.7. Let $f : E \to \mathbb{R}$. Then f is continuous if and only if for every closed
set $A \subset \mathbb{R}$, the set $f^{-1}(A)$ is closed in E.


Try using Theorem 7.45. 


Hint to Exercise 7.8. Let S be uncountable. Then S has a limit point. 


Use the fact that the countable union of countable sets is countable to show that a 
bounded interval contains uncountably many points of S. 


Hint to Exercise 7.9. Let $f : [a,b] \to \mathbb{R}$ be integrable. Let $g : [a,b] \to \mathbb{R}$ and let
$F = \{x_1, x_2, \ldots, x_n\}$ be a finite subset of $[a,b]$, where $g(x) = f(x)$ for all $x \in [a,b] \setminus F$. Then
$\int_a^b g = \int_a^b f$.


Use the Lebesgue Characterization of Riemann Integrability to show that g is integrable 
and then use the characterization of integral in terms of a sequence of Riemann sums to 
show the integrals are equal. 




Solutions: 


Solution to Exercise 7.1. Let $m, n \in \mathbb{N}$. Then either $m^{1/n} \in \mathbb{N}$ or $m^{1/n}$ is irrational.


Proof. Suppose that $\frac{p}{q} = m^{1/n}$, where $p \in \mathbb{N}$ and $q \in \mathbb{N}$ and $\frac{p}{q}$ is in reduced terms with $q > 1$.
Then $\frac{p^n}{q^n} = m$. Since p and q have no common prime factors, $p^n$ and $q^n$ have no common
prime factors by the Fundamental Theorem of Arithmetic. This means that $p^n$ cannot be
a multiple of $q^n$ and in particular $p^n$ is not equal to $m(q^n)$. This is a contradiction.


Solution to Exercise 7.2. Euclidean Algorithm (main premise). Let a, b be positive
integers so that $a < b$ and q is the largest integer so that $b - aq > 0$. Then the greatest
common divisor of a and b is the same as the greatest common divisor of a and $b - aq$.


Proof. Let $d \in \mathbb{N}$. First, assume that d divides a and b. Then for some integers m, n it is
true that $a = md$ and $b = nd$, which means that $b - aq = d(n - mq)$ and therefore d divides
$b - aq$.

Suppose that d divides both a and $b - aq$. Then for some integers n, m we know that
$b - aq = dm$ and $a = dn$. Thus, $b = d(m + nq)$ which means that d divides both a and b.
Since the common divisors of a and b are the same as the common divisors of a and $b - aq$,
it follows that both pairs have the same greatest common divisor.
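Repeating the step in Exercise 7.2 yields a method for computing greatest common divisors. The following is a minimal sketch of that idea, assuming we keep replacing the larger number b by $b - aq$ until the two numbers agree. The function name `gcd_by_reduction` is invented for this illustration, and the comparison with Python's built-in `math.gcd` is only a sanity check.

```python
import math

def gcd_by_reduction(a, b):
    """Greatest common divisor of positive integers a and b."""
    while a != b:
        if a > b:
            a, b = b, a        # arrange that a < b, as in the exercise
        q = (b - 1) // a       # largest integer q with b - a*q > 0
        b = b - a * q          # gcd(a, b) equals gcd(a, b - a*q)
    return a

for a, b in [(4, 12), (6, 15), (17, 5), (84, 36)]:
    print(a, b, gcd_by_reduction(a, b), math.gcd(a, b))
```

Each pass strictly decreases the larger of the two numbers while keeping both positive, so the loop must stop, and it stops only when both numbers equal the common greatest divisor.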


Solution to Exercise 7.3. Let m, n be natural numbers. Then there are integers s, t so that
$\frac{m}{n} = \frac{s}{t}$ and the prime factor decompositions of s and t do not share any prime factors (such a
fraction $\frac{s}{t}$ is said to be written in reduced terms). If p is a prime factor of m and is not a
prime factor of n and $\frac{m}{n}$ is an integer then p is a prime factor of the integer $\frac{m}{n}$.

Proof. Let $m = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k}$ and $n = q_1^{\beta_1} q_2^{\beta_2} \cdots q_l^{\beta_l}$ be the prime decompositions of m
and n. Let $p_{i_1}, \ldots, p_{i_s}$ be the primes in the decomposition of m that do not appear in
the decomposition for n, and let $q_{j_1}, q_{j_2}, \ldots, q_{j_t}$ be the prime numbers in the decomposition
of n that do not appear in the decomposition for m. Let $p_{i_{s+1}}, p_{i_{s+2}}, \ldots, p_{i_k}$ be the prime
numbers which appear in both decompositions. Then if $v_{i_{s+r}}$ is the power of $p_{i_{s+r}}$ in the
numerator minus the power of the same prime in the decomposition of the denominator, we
have

\[ \frac{m}{n} = \left(p_{i_1}^{\alpha_{i_1}} \cdots p_{i_s}^{\alpha_{i_s}}\right)\left(p_{i_{s+1}}^{v_{i_{s+1}}} \cdots p_{i_k}^{v_{i_k}}\right)\left(q_{j_1}^{-\beta_{j_1}} \cdots q_{j_t}^{-\beta_{j_t}}\right). \]

This can be written as a fraction
where, depending on whether the net power $v_{i_{s+r}}$ of each shared prime term is positive or
negative, we put $p_{i_{s+r}}^{v_{i_{s+r}}}$ in the factorization of the numerator (if the exponent is positive) or
$p_{i_{s+r}}^{-v_{i_{s+r}}}$ in the factorization of the denominator (if the exponent is negative). There are then
no prime numbers common to the factorizations of both numerator and denominator.

Note that if $\frac{m}{n}$ is an integer then if p is one of the prime numbers in the factorization
of m which does not occur in the prime factorization of n, the reduced form described
above could only be an integer if all primes $q_i$ canceled with corresponding primes in the
numerator, and the prime factorization of the resulting fraction in reduced terms would still
have p (in the numerator), so it would follow that p is a factor of $\frac{m}{n}$.
n 


Solution to Exercise 7.4. Fermat’s Little Theorem. Let p be prime. Then $a^p - a$ is
divisible by p for every natural number a.


Proof. We proceed by induction on a. First, note that $1^p - 1 = 0$ is divisible by every number,
including p. Assume that $k^p - k = mp$ for some natural number m. Then

\[ (k+1)^p - (k+1) = k^p + 1 - k - 1 + \sum_{i=1}^{p-1} \binom{p}{i} k^i = (k^p - k) + \sum_{i=1}^{p-1} \binom{p}{i} k^i. \]

Note that if $1 \le i \le p - 1$ then $\binom{p}{i} = \frac{p!}{(p-i)!\,i!}$,
where p divides the numerator, but since no prime factor of the denominator is p or larger,
p does not divide the denominator since p is prime. To see this, use the Fundamental
Theorem of Arithmetic to factor each of the integers in the list $i, i-1, i-2, \ldots, 1$
into their prime decompositions (where each prime in the decomposition of each number
is less than p since a number in a decomposition cannot exceed the number it is in the
decomposition for) and multiply them together to get a product of prime numbers equal to
$i!$. Similarly, none of $p-i, p-i-1, \ldots, 1$ has p as a factor in its prime decomposition.
Thus, the prime decomposition of $(p-i)!\,i!$ does not contain any prime factor equal to (or
larger than) p. Hence, the factor of p in $p!$ does not cancel with any of the prime numbers
in the denominator of the fraction $\frac{p!}{(p-i)!\,i!}$, which means that $\binom{p}{i}$ is divisible by p (as
addressed in the preceding exercise). Thus, there are natural numbers $m_1, m_2, \ldots, m_{p-1}$ so
that $(k+1)^p - (k+1) = mp + m_1 p + m_2 p + \cdots + m_{p-1} p$, where each $m_i p = \frac{p!}{(p-i)!\,i!} k^i$,
which means that $(k+1)^p - (k+1)$ is divisible by p. The result then follows by induction.
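The two facts used in this proof can be spot-checked by machine for small primes. The sketch below is only an illustration (the function names and the cutoff `max_a` are assumptions for this example): it tests that p divides every binomial coefficient $\binom{p}{i}$ with $1 \le i \le p - 1$, and that p divides $a^p - a$ for a range of values of a. It also shows both statements failing for the composite number 4.

```python
from math import comb

def binomials_divisible_by(p):
    """True if p divides C(p, i) for every 1 <= i <= p - 1."""
    return all(comb(p, i) % p == 0 for i in range(1, p))

def fermat_holds(p, max_a=50):
    """True if p divides a**p - a for every natural number a up to max_a."""
    return all((a ** p - a) % p == 0 for a in range(1, max_a + 1))

for p in [2, 3, 5, 7, 11, 13, 4]:
    print(p, binomials_divisible_by(p), fermat_holds(p))
```

A finite check like this is evidence, not a proof; the induction argument above is what covers every natural number a.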


Solution to Exercise 7.5. If A and B are countable sets then A x B is countable. 


Proof. If A or B is empty then $A \times B$ is empty and therefore countable. Assume A and B
are not empty. By Theorem 7.19, since A is countable there is an onto function $h : \mathbb{N} \to A$,
which means that we can list $A = \{a_1, a_2, a_3, \ldots\}$. Similarly, we can write $B = \{b_1, b_2, b_3, \ldots\}$.
By definition of Cartesian product, we have $A \times B = \bigcup_{i=1}^{\infty} \{a_i\} \times \{b_1, b_2, b_3, \ldots\}$. For each
$i \in \mathbb{N}$, there is an onto function $f : \mathbb{N} \to \{a_i\} \times \{b_1, b_2, b_3, \ldots\} = \{a_i\} \times B$ defined by $f(j) = (a_i, b_j)$
for each $j \in \mathbb{N}$. Hence, $\{a_i\} \times \{b_1, b_2, b_3, \ldots\}$ is countable for each $i \in \mathbb{N}$. Thus, $A \times B$ is a
countable union of countable sets and is therefore countable.


Solution to Exercise 7.6. Let $a < b$. Then there is an irrational number r so that
$a < r < b$.


Proof. Since we have shown that each set containing an open interval is uncountable, we
know that $(a,b)$ is uncountable. Since $\mathbb{Q} \cap (a,b) \subset \mathbb{Q}$ and $\mathbb{Q}$ is countable, we know that
$\mathbb{Q} \cap (a,b)$ is countable since a subset of a countable set is countable. Hence $(a,b)$ cannot
equal $\mathbb{Q} \cap (a,b)$, so there are irrational numbers in $(a,b)$.


Solution to Exercise 7.7. Let $f : E \to \mathbb{R}$. Then f is continuous if and only if for every
closed set $A \subset \mathbb{R}$, the set $f^{-1}(A)$ is closed in E.


Proof. Assume f is continuous and let A be closed. By Theorem 7.45, we know that
$f^{-1}(\mathbb{R} \setminus A)$ is open in E, which means that $E \setminus f^{-1}(\mathbb{R} \setminus A) = f^{-1}(A)$ is closed in E.
Assume for every closed set $A \subset \mathbb{R}$, the set $f^{-1}(A)$ is closed in E. Let U be open. Then
$\mathbb{R} \setminus U$ is closed and $f^{-1}(U) = E \setminus f^{-1}(\mathbb{R} \setminus U)$, which is the complement of a closed set in E,
which means that $f^{-1}(U)$ is open in E. Thus, by Theorem 7.45, we know that f is continuous.


Solution to Exercise 7.8. Let S be uncountable. Then S has a limit point. 


Proof. If each $E_i = [i, i+1] \cap S$, where $i \in \mathbb{Z}$, is countable, then by Theorem 7.23, it
follows that $S = \bigcup_{i \in \mathbb{Z}} E_i$ is countable (which is impossible). Hence, for some integer m we
know that $E_m = [m, m+1] \cap S$ is uncountable and therefore infinite, which means that
$E_m$ has a limit point p by Exercise 3.18, and therefore p is a limit point of S by Exercise
3.14.


Solution to Exercise 7.9. Let $f : [a,b] \to \mathbb{R}$ be integrable. Let $g : [a,b] \to \mathbb{R}$ and let F be
a finite subset of $[a,b]$, where $g(x) = f(x)$ for all $x \in [a,b] \setminus F$. Then $\int_a^b g = \int_a^b f$.


Proof. Let $D_f = \{x \in [a,b] \mid f \text{ is not continuous at } x\}$ and $D_g = \{x \in [a,b] \mid g \text{ is not continuous
at } x\}$. Let $p \in [a,b] \setminus F$ and let f be continuous at p. Since F is finite, it is a finite union
of (trivial) closed intervals and therefore a finite union of closed sets, which is closed. Let
$\epsilon > 0$. Choose $\delta_1 > 0$ so that if $|p - x| < \delta_1$ and $x \in [a,b]$ then $|f(p) - f(x)| < \epsilon$. Since
F is closed, we can find $\delta_2 > 0$ so that $(p - \delta_2, p + \delta_2) \cap F = \emptyset$. If $\delta = \min(\delta_1, \delta_2)$ then if
$|x - p| < \delta$ and $x \in [a,b]$ then $|g(x) - g(p)| = |f(x) - f(p)| < \epsilon$, so g is continuous at p.
Hence $D_g \subseteq D_f \cup F$. Since f is integrable, we know that $\lambda(D_f) = 0$. Since F is finite,
$\lambda(F) = 0$. Hence, by Theorem 7.62, we know that $\lambda(D_f \cup F) = 0$, so by Theorem 7.60, we
know that $\lambda(D_g) = 0$, so g is integrable by Theorem 7.73.

Choose a sequence of partitions $\{P_n\}$ of $[a,b]$ so that $\{|P_n|\} \to 0$. Since F is finite, for
each $P_n$ we can choose a marking $T_n$ so that $T_n \cap F = \emptyset$. Thus, $S_{T_n}(f, P_n) = S_{T_n}(g, P_n)$
for all $n \in \mathbb{N}$. Hence, by Theorem 6.7, we know that $\{S_{T_n}(f, P_n)\} \to \int_a^b f$ and therefore
$\{S_{T_n}(g, P_n)\} \to \int_a^b f$, and hence $\int_a^b g = \int_a^b f$.


Chapter 8 


Series 


Definition 57 


If $\{x_n\}$ is a sequence then the nth partial sum of this sequence is $s_n = \sum_{i=1}^{n} x_i$. The
sequence of partial sums $\{s_n\}$ is the series $\sum_{n=1}^{\infty} x_n$. We also use $\sum_{n=1}^{\infty} x_n$ to refer to the
point to which this series converges, depending on context.
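To make the definition concrete, the sketch below builds the sequence of partial sums for a sample sequence. The choice $x_n = \frac{1}{n(n+1)}$ and the function name `partial_sums` are assumptions made for this illustration; exact fractions are used so that the pattern $s_n = \frac{n}{n+1}$ is visible without rounding.

```python
from fractions import Fraction
from itertools import accumulate

def partial_sums(terms):
    """Return [s_1, s_2, ..., s_n] where s_k = x_1 + ... + x_k."""
    return list(accumulate(terms))

# Sample sequence x_n = 1/(n(n+1)) for n = 1, ..., 8.
terms = [Fraction(1, n * (n + 1)) for n in range(1, 9)]
for n, s in enumerate(partial_sums(terms), start=1):
    print(n, s)
```

Here the series is the whole sequence of printed values, while the sum of the series is the limit those values approach.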


Theorem 8.1. Cauchy Convergence Criterion. The series $\sum_{n=1}^{\infty} x_n$ converges if and only if
for every $\epsilon > 0$ there is an integer k so that if $n > m > k$ then $\left|\sum_{i=m+1}^{n} x_i\right| < \epsilon$.

Proof. We know that the sequence of partial sums $\{s_n\} = \left\{\sum_{i=1}^{n} x_i\right\}$ converges if and only if
it is a Cauchy sequence by Theorem 3.25, which is true if and only if for every $\epsilon > 0$ there
is an integer k so that if $n > m > k$ then $|s_n - s_m| = \left|\sum_{i=m+1}^{n} x_i\right| < \epsilon$.


Theorem 8.2. Let k be a non-negative integer and $c \neq 0$ and let $\{x_n\}$ be a sequence.
Then $\sum_{n=1}^{\infty} x_n$ converges if and only if $\sum_{n=k+1}^{\infty} c x_n$ converges. Furthermore, if $\sum_{n=1}^{\infty} x_n = L$ then

\[ \sum_{n=k+1}^{\infty} c x_n = cL - \sum_{n=1}^{k} c x_n. \]

Proof. Let $\{s_n\}$ be the sequence of partial sums of the sequence $\{x_n\}$ and note that $\{s_{n+k} - s_k\}$
is the sequence of partial sums of $\sum_{n=k+1}^{\infty} x_n$. Then $\{s_n\} \to L$ if and only if $\{s_{n+k}\} \to L$ by
Theorem 3.26, and $\{s_{n+k}\} \to L$ if and only if $\{s_{n+k} - s_k\} \to L - s_k$. Thus, $\{s_n\} \to L$ if and only if
$\{c(s_{n+k} - s_k)\} \to c(L - s_k)$ by the product rule for sequence limits. In other words,
$\sum_{n=k+1}^{\infty} c x_n = cL - \sum_{n=1}^{k} c x_n$.

Theorem 8.3. Divergence Test. If $\sum_{n=1}^{\infty} x_n$ converges then $\{x_n\} \to 0$.

Proof. Let $s_n = \sum_{i=1}^{n} x_i$ be a sequence converging to the point $s_\infty$. Then $\{s_{n+1}\}$ is a
subsequence of $\{s_n\}$ converging to $s_\infty$, by Theorem 3.11. Hence the sequence $\{s_{n+1} - s_n\} =
\{x_{n+1}\} \to s_\infty - s_\infty = 0$, so $\{x_n\} \to 0$.


Theorem 8.4. Comparison Test. Let $\{a_n\}$ and $\{b_n\}$ be sequences of non-negative terms
so that $a_n \le b_n$ for all $n \in \mathbb{N}$. If $\sum_{n=1}^{\infty} b_n$ converges then $\sum_{n=1}^{\infty} a_n$ converges. If $\sum_{n=1}^{\infty} a_n$ diverges
then $\sum_{n=1}^{\infty} b_n$ diverges.

Proof. Let $A_n = \sum_{i=1}^{n} a_i$ and let $B_n = \sum_{i=1}^{n} b_i$. Since the terms of $\{a_n\}$ and $\{b_n\}$ are
non-negative, it follows that $\{A_n\}$ and $\{B_n\}$ are non-decreasing sequences, and by The Monotone
Convergence Theorem they converge if and only if they are bounded above. Thus, if $\sum_{n=1}^{\infty} b_n$
converges then $\{B_n\}$ is bounded above, so $\{A_n\}$ is bounded above since $A_n \le B_n$ for each
$n \in \mathbb{N}$, so $\sum_{n=1}^{\infty} a_n$ converges. Hence, if $\sum_{n=1}^{\infty} a_n$ diverges then $\sum_{n=1}^{\infty} b_n$ diverges.


n=1 n=1 n=1 


The comparison test can be thought of as the strongest test for verifying the convergence 
of series sums of non-negative terms in the sense that every convergent series consisting of 
only positive terms is less than some other convergent series, so if you could find the right 
(larger) series to compare to and show this larger series is convergent then you could always 
show the smaller series is convergent. In practice, however, this is not always reasonable. 
Frequently, a good series to compare to is a p-series, addressed in the exercises. 


Example 8.1. Determine whether $\sum_{n=1}^{\infty} \frac{5 + \cos(n)}{n^2 + \ln(n+4)}$ converges.


Solution. Since $5 + \cos(n) \le 6$ and $n^2 + \ln(n+4) > n^2$ for all $n \in \mathbb{N}$ it follows that
$\frac{5 + \cos(n)}{n^2 + \ln(n+4)} \le \frac{6}{n^2}$ for all positive integers n. Since we know that the p-series $\sum_{n=1}^{\infty} \frac{1}{n^2}$
converges, it follows from Theorem 8.2 that $\sum_{n=1}^{\infty} \frac{6}{n^2}$ converges, so $\sum_{n=1}^{\infty} \frac{5 + \cos(n)}{n^2 + \ln(n+4)}$ converges
by the Comparison Test.
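The comparison in this example can be observed numerically. The sketch below assumes the series is $\sum \frac{5 + \cos(n)}{n^2 + \ln(n+4)}$ with comparison terms $\frac{6}{n^2}$ (the exponent 2 is our reading of the example); the function names `term` and `bound` are invented for the illustration. It checks the termwise inequality and prints a partial sum of each series.

```python
import math

def term(n):
    """n-th term of the series in the example (exponent 2 assumed)."""
    return (5 + math.cos(n)) / (n ** 2 + math.log(n + 4))

def bound(n):
    """n-th term of the larger comparison series."""
    return 6 / n ** 2

N = 1000
print(all(0 < term(n) <= bound(n) for n in range(1, N + 1)))
print(sum(term(n) for n in range(1, N + 1)))
print(sum(bound(n) for n in range(1, N + 1)))
```

A numerical table cannot prove convergence, but it shows the mechanism of the Comparison Test: the smaller partial sums stay trapped below the bounded partial sums of the larger series.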


Theorem 8.5. Limit Comparison Test. Let $\{a_n\}$ and $\{b_n\}$ be sequences of non-negative
terms so that $\lim_{n \to \infty} \frac{a_n}{b_n} = L > 0$. Then $\sum_{n=1}^{\infty} a_n$ converges if and only if $\sum_{n=1}^{\infty} b_n$ converges.


Proof. Let $A_n = \sum_{i=1}^{n} a_i$, and let $B_n = \sum_{i=1}^{n} b_i$. We can choose a positive integer N so that
if $n > N$ then $\left|\frac{a_n}{b_n} - L\right| < \frac{L}{2}$, and hence $\frac{L}{2} < \frac{a_n}{b_n} < \frac{3L}{2}$, which means that $\frac{b_n L}{2} < a_n < \frac{3L b_n}{2}$.
It follows that if $\sum_{n=N+1}^{\infty} b_n$ converges then $\sum_{n=N+1}^{\infty} \frac{3L b_n}{2}$ converges, and so
$\sum_{n=N+1}^{\infty} a_n$ converges by the comparison test. Similarly, if $\sum_{n=N+1}^{\infty} a_n$ converges then
$\sum_{n=N+1}^{\infty} \frac{b_n L}{2}$ converges and so $\sum_{n=N+1}^{\infty} b_n$ converges. Hence, $\sum_{n=1}^{\infty} a_n$
converges if and only if $\sum_{n=1}^{\infty} b_n$ converges.


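Theorem 8.6. Geometric Series Convergence. Let $a_n = ar^{n-1}$ for each positive integer n.
Then, for $r \neq 1$, $s_n = \sum_{i=1}^{n} a_i = \frac{a(1 - r^n)}{1 - r}$, and if $|r| < 1$ then $\sum_{n=1}^{\infty} a_n = \frac{a}{1 - r}$.

Proof. Note that $r s_n = \sum_{i=1}^{n} ar^i$, so $s_n - r s_n = a - ar^n$, so $s_n = \sum_{i=1}^{n} ar^{i-1} = \frac{a(1 - r^n)}{1 - r}$. If
$|r| < 1$ then $\lim_{n \to \infty} r^n = 0$, so $\sum_{n=1}^{\infty} ar^{n-1} = \lim_{n \to \infty} \frac{a(1 - r^n)}{1 - r} = \frac{a}{1 - r}$.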
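The closed form for the partial sums of a geometric series can be compared against a term-by-term sum. This is a small sketch with assumed sample values $a = 3$ and $r = \frac{1}{2}$; the function names are invented for the illustration.

```python
def geometric_partial_sum(a, r, n):
    """s_n = a + a*r + ... + a*r**(n-1), added term by term."""
    return sum(a * r ** (i - 1) for i in range(1, n + 1))

def closed_form(a, r, n):
    """s_n = a(1 - r**n)/(1 - r), valid for r != 1."""
    return a * (1 - r ** n) / (1 - r)

a, r = 3.0, 0.5
for n in [1, 5, 10, 60]:
    print(n, geometric_partial_sum(a, r, n), closed_form(a, r, n))
print("limit a/(1-r):", a / (1 - r))
```

As n grows the partial sums approach $\frac{a}{1-r}$, which is 6 for these sample values.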


Definition 58 


We define $\int_a^{\infty} f = \lim_{b \to \infty} \int_a^b f$ if this limit exists.



Theorem 8.7. Let $F : [a,\infty) \to \mathbb{R}$ be a monotone function. Then $\lim_{x \to \infty} F(x) = L$ exists if
and only if F is bounded. If F is non-decreasing then L is the least upper bound of the
range of F. If F is non-increasing then L is the greatest lower bound of the range of F. If
F is not bounded and is non-decreasing then $\lim_{x \to \infty} F(x) = \infty$. If F is not bounded and is
non-increasing then $\lim_{x \to \infty} F(x) = -\infty$.


Proof. We first assume that F is non-decreasing. If F is not bounded then it is not bounded
above, since the range of F is bounded below by $F(a)$. If F is bounded, let u be the least
upper bound for the range of F. Given any $\epsilon > 0$ it follows that there is some $x_0$ so that
$F(x_0) > u - \epsilon$. Since F is non-decreasing we know that if $x > x_0$ then
$u - \epsilon < F(x_0) \le F(x) \le u$, so $|F(x) - u| < \epsilon$ which means
that $\lim_{x \to \infty} F(x) = u$.

If the range of F is not bounded above then given any number M there is some $x_0 \in
[a,\infty)$ so that $F(x_0) > M$, so if $x > x_0$ then $F(x) > M$, which means that $\lim_{x \to \infty} F(x) = \infty$.

Finally, if F is non-increasing then $-F$ is non-decreasing. Since the range of F is
bounded if and only if the range of $-F$ is bounded, it follows that $-F$ converges if and only
if the range of F is bounded, in which case $\lim_{x \to \infty} -F(x) = -b$, the least upper bound of the
range of $-F$, but then b is the greatest lower bound of the range of F and $\lim_{x \to \infty} F(x) = b$.
Likewise, if the range of F is not bounded then it is not bounded below, and so the range of
$-F$ is not bounded above, from which we conclude that $\lim_{x \to \infty} -F(x) = \infty$, so $\lim_{x \to \infty} F(x) =
-\infty$.


Theorem 8.8. The Integral Test and Integral Remainder Theorem. Let $f : [1,\infty) \to (0,\infty)$
be a non-increasing function, where $f(i) = a_i$ for each $i \in \mathbb{N}$. Then the series $\sum_{i=1}^{\infty} a_i$
converges if and only if $\int_1^{\infty} f(x)\,dx$ converges. Furthermore, if we let $s_n = \sum_{i=1}^{n} a_i$ for each
$n \in \mathbb{N}$ then if $L = \sum_{i=1}^{\infty} a_i$ then for any natural number m it follows that

\[ s_m + \int_{m+1}^{\infty} f(x)\,dx \le L \le s_m + \int_m^{\infty} f(x)\,dx. \]


Proof. First, note that $F(x) = \int_1^x f(t)\,dt$ is defined by Exercise 6.6 since f is monotone,
and F is increasing because if $1 \le a < b$ it follows that $f(x) \ge f(b) > 0$ on $[a,b]$ which
means that $F(b) - F(a) = \int_a^b f(x)\,dx \ge f(b)(b - a) > 0$. Likewise, since each $a_i > 0$,
if we set $s_n = \sum_{i=1}^{n} a_i$ then $\{s_n\}$ is increasing, since for $n > m$ we have
$s_n - s_m = \sum_{i=m+1}^{n} a_i > 0$.

Next, we note that, for each natural number i it is true that $a_i \ge f(x) \ge a_{i+1}$ for all $x \in
[i, i+1]$. From this it follows that if $j, k \in \mathbb{N}$ so that $j < k$ then

\[ \sum_{i=j}^{k-1} a_i \ge \sum_{i=j}^{k-1} \int_i^{i+1} f(x)\,dx = \int_j^k f(x)\,dx = F(k) - F(j). \]

Likewise, it follows that

\[ \sum_{i=j+1}^{k} a_i \le \sum_{i=j}^{k-1} \int_i^{i+1} f(x)\,dx = \int_j^k f(x)\,dx. \]

We then observe that if $\{s_n\} \to L$ then by the Monotone Convergence Theorem, it
follows that L is the least upper bound for $\{s_n\}$. Since $\int_1^x f(t)\,dt \le s_n \le L$ for any $n \in \mathbb{N}$
that exceeds x, it follows that $F(x) \le L$ for all $x \in [1,\infty)$ which means that the range of F
is bounded and has a least upper bound u. By Theorem 8.7, $\lim_{x \to \infty} F(x) = \int_1^{\infty} f(x)\,dx = u$.

Conversely, if $\{s_n\}$ is divergent then $\{s_n\}$ is not bounded above, which means that
given any number M there is some $k \in \mathbb{N}$ so that $s_k > a_1 + M$. It follows that if $x > k$
then $F(x) > F(k) \ge \sum_{i=2}^{k} a_i = s_k - a_1 > M$, so F is unbounded. Thus, by Theorem 8.7,
$\int_1^{\infty} f(x)\,dx = \infty$.

Finally, let $m \in \mathbb{N}$ and let $\{s_n\} \to L$. Then $\sum_{i=m+1}^{\infty} a_i = L - s_m$. For each $n > m$
we also know that $\sum_{i=m+1}^{n-1} a_i \ge \int_{m+1}^{n} f(x)\,dx$ which implies that $\sum_{i=m+1}^{\infty} a_i \ge \int_{m+1}^{\infty} f(x)\,dx$,
so $L = s_m + \sum_{i=m+1}^{\infty} a_i \ge s_m + \int_{m+1}^{\infty} f(x)\,dx$ by the Comparison Theorem. Likewise, since
$\sum_{i=m+1}^{n} a_i \le \int_m^n f(x)\,dx$ it follows from the Comparison Theorem that

\[ L = s_m + \sum_{i=m+1}^{\infty} a_i \le s_m + \int_m^{\infty} f(x)\,dx. \]


The following graph illustrates the argument for the proof of the Integral Test. 


[Figure: two panels, titled "Area Under f Less than Sum" and "Area Under f More than Sum".]

Another way of viewing the remainder is as follows: 
Theorem 8.9. If $a_n = f(n)$, where f is a decreasing continuous function so that $\int_1^{\infty} f(x)\,dx$
converges, then for each natural number n, if $s_n = \sum_{i=1}^{n} a_i$ then

\[ s_n + \frac{\int_{n+1}^{\infty} f(x)\,dx + \int_n^{\infty} f(x)\,dx}{2} \]

is within $\frac{\int_n^{\infty} f(x)\,dx - \int_{n+1}^{\infty} f(x)\,dx}{2}$ of the series sum $s = \sum_{i=1}^{\infty} a_i$.


Proof. By the preceding theorem, we know that $s \in \left[s_n + \int_{n+1}^{\infty} f(x)\,dx,\ s_n + \int_n^{\infty} f(x)\,dx\right]$,
which means that the midpoint $s_n + \frac{\int_{n+1}^{\infty} f(x)\,dx + \int_n^{\infty} f(x)\,dx}{2}$ of this interval is within a
distance of half its length from s, which is $\frac{\int_n^{\infty} f(x)\,dx - \int_{n+1}^{\infty} f(x)\,dx}{2}$.
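The midpoint estimate can be tried on a series whose sum is known. The sketch below assumes $f(x) = \frac{1}{x^2}$, for which $\int_n^{\infty} f(x)\,dx = \frac{1}{n}$, and uses the well-known value $\sum_{i=1}^{\infty} \frac{1}{i^2} = \frac{\pi^2}{6}$ only to measure the actual error. That value is not derived in this text, and the function name `estimate_with_bound` is invented for the illustration.

```python
import math

def estimate_with_bound(n):
    """Midpoint estimate of sum 1/i^2 and its guaranteed error bound."""
    s_n = sum(1 / i ** 2 for i in range(1, n + 1))
    tail_from_n = 1 / n          # integral of 1/x^2 over [n, infinity)
    tail_from_n1 = 1 / (n + 1)   # integral of 1/x^2 over [n+1, infinity)
    estimate = s_n + (tail_from_n1 + tail_from_n) / 2
    error_bound = (tail_from_n - tail_from_n1) / 2
    return estimate, error_bound

exact = math.pi ** 2 / 6
for n in [5, 10, 100]:
    estimate, error_bound = estimate_with_bound(n)
    print(n, estimate, abs(estimate - exact), error_bound)
```

In each row the actual error in the third column should not exceed the bound in the fourth column.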


We should also point out that if we don’t mind adding more terms then we have a
second, simpler formula that $\sum_{i=1}^{\infty} a_i = s_n + R_n$, where $R_n = \sum_{i=n+1}^{\infty} a_i \le \int_n^{\infty} f(x)\,dx$. If you
estimate the series with a remainder that is bounded in this manner, the remainder is easier
to find a bound for, but you must typically add more terms to achieve an estimate within a given
error.
Also, observe that the formula in the theorem would only yield a "$\le$" sign, but if f is strictly
decreasing we note that over a given interval $\left[i, i + \frac{1}{2}\right]$ it is the case that $f(x) \ge f\left(i + \frac{1}{2}\right) > f(i+1)$
which means that in fact $\int_i^{i+1} f(x)\,dx - f(i+1)$ is at least $\frac{1}{2}\left(f\left(i + \frac{1}{2}\right) - f(i+1)\right) > 0$,
so the partial inequality signs could be replaced by strict inequality signs if f is strictly
decreasing.


Definition 59 


We say a series $\sum_{n=1}^{\infty} a_n$ converges absolutely if $\sum_{n=1}^{\infty} |a_n|$ converges. We say that
$\sum_{n=1}^{\infty} a_n$ converges conditionally if it converges but does not converge absolutely.


Theorem 8.10. If a series $\sum_{n=1}^{\infty} a_n$ converges absolutely then $\sum_{n=1}^{\infty} a_n$ converges.


Proof. Let $\epsilon > 0$. Using the Cauchy Convergence Criterion, if $\sum_{n=1}^{\infty} a_n$ converges absolutely
then we can find an integer k so that if $n > m > k$ then $\left|\sum_{i=m+1}^{n} |a_i|\right| = \sum_{i=m+1}^{n} |a_i| < \epsilon$, so by
the triangle inequality $\left|\sum_{i=m+1}^{n} a_i\right| < \epsilon$, and therefore $\sum_{n=1}^{\infty} a_n$ converges.



Theorem 8.11. Ratio Test. Let $\{a_n\}$ be a sequence so that $\lim_{n \to \infty} \left|\frac{a_{n+1}}{a_n}\right| = L$. Then $\sum_{n=1}^{\infty} a_n$
converges absolutely if $L < 1$, and $\sum_{n=1}^{\infty} a_n$ diverges if $L > 1$.

Proof. If $L > 1$ then for sufficiently large n it follows that $\left|\left|\frac{a_{n+1}}{a_n}\right| - L\right| < L - 1$, so $\left|\frac{a_{n+1}}{a_n}\right| > 1$,
from which we see that the terms $|a_n|$ are eventually increasing and so do not converge to zero,
so $\sum_{n=1}^{\infty} a_n$ diverges by the Divergence Test.

Let $L < 1$. Choose $u \in (L,1)$. Choose k so that if $n \ge k$ then $\left|\frac{a_{n+1}}{a_n}\right| < u$. Then
$\left|\frac{a_{k+1}}{a_k}\right| < u$, so $|a_{k+1}| < u|a_k|$. Similarly, $|a_{k+2}| < u|a_{k+1}| < u^2|a_k|$, and inductively we find
that if $n = k + i$ then $|a_n| < u^i|a_k|$. Hence, $\sum_{n=k}^{\infty} |a_n|$ converges by comparison with the
geometric series $\sum_{n=1}^{\infty} |a_k|u^{n-1}$, so $\sum_{n=1}^{\infty} a_n$ converges absolutely.


A quick consequence of the Ratio Test is the following theorem, which is sometimes
helpful as well.


Theorem 8.12. Let $\{a_n\}$ be a sequence so that $\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = L$. Then $\{a_n\} \to 0$ if $L < 1$ and $\{|a_n|\} \to \infty$ if $L > 1$.


Proof. By the Ratio Test, if $L < 1$ then $\sum_{n=1}^{\infty} a_n$ converges so $\{a_n\} \to 0$ by the Divergence Test. If $L > 1$ then choose $1 < u < L$. For some $k \in \mathbb{N}$, if $n \ge k$ then $\left|\frac{a_{n+1}}{a_n}\right| > u$ which means that $|a_{k+m}| > u^m |a_k|$ so $\{|a_n|\} \to \infty$.


Example 8.2. Determine whether $\sum_{n=1}^{\infty} \frac{(-1)^n n^2}{n!}$ converges, with justification.

Solution. We take the ratio of the $(n+1)$st term over the $n$th term in absolute value and take the limit as $n$ approaches infinity. This is $\lim_{n\to\infty} \left|\frac{(-1)^{n+1}(n+1)^2/(n+1)!}{(-1)^n n^2/n!}\right|$. In the absolute value the $(-1)^n$ doesn't change anything so we ignore that. We have $\frac{(n+1)!}{n!} = n+1$ and $\lim_{n\to\infty} \frac{(n+1)^2}{n^2} = 1$ since the numerator and denominator are both second degree with leading coefficient one. Hence, we get $\lim_{n\to\infty} \frac{1}{n+1} = 0 < 1$, so by the Ratio Test this series converges (absolutely).
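The ratio in Example 8.2 can also be checked with exact rational arithmetic. The sketch below is only an illustration; the helper names `term` and `ratio` are hypothetical. Simplifying by hand, the ratio equals $\frac{n+1}{n^2}$, which is what the code is expected to reproduce.

```python
from fractions import Fraction
from math import factorial

def term(n):
    """The n-th term (-1)^n * n^2 / n! of the series in Example 8.2, as an exact fraction."""
    return Fraction((-1) ** n * n**2, factorial(n))

def ratio(n):
    """|a_{n+1} / a_n|, the quantity whose limit the Ratio Test examines."""
    return abs(term(n + 1) / term(n))

for n in (1, 10, 100, 1000):
    print(n, float(ratio(n)))  # the printed ratios shrink toward 0
```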




When using the ratio test it is useful to notice that for any nonzero polynomial $P(x)$ it is true that $\lim_{n\to\infty} \frac{P(n+1)}{P(n)} = 1$. This means, in particular, that the ratio test is useless for determining the convergence of series of rational functions of $n$ since the limit will always be one, and factors of expressions which are rational functions will always correspond to a ratio converging to one, just as $\frac{(n+1)^2}{n^2}$ converged to one as $n$ approached infinity in the example above. This means that polynomial factors of numerator or denominator will just multiply the resulting limit by one when the limit of the ratio of $a_{n+1}$ and $a_n$ is taken. We will prove that here, along with a second component that will help with the Root Test.


Theorem 8.13. Let $P(x) = a_m x^m + a_{m-1} x^{m-1} + \ldots + a_1 x + a_0$ be a polynomial with $a_m \neq 0$. Then $\lim_{n\to\infty} \frac{P(n+1)}{P(n)} = 1$ and $\lim_{n\to\infty} |P(n)|^{\frac{1}{n}} = 1$.

Proof. Using the Binomial Theorem, $P(x+1) = a_m \sum_{i=0}^{m} \binom{m}{i} x^i + a_{m-1} \sum_{i=0}^{m-1} \binom{m-1}{i} x^i + \ldots + a_1 \sum_{i=0}^{1} \binom{1}{i} x^i + a_0$. Hence, the leading coefficient is $a_m$, and the degree of $P(n+1)$ is $m$ in the variable $n$. Since $P(n)$ is also a polynomial of degree $m$ with leading coefficient $a_m$ it follows that $\lim_{n\to\infty} \frac{P(n+1)}{P(n)} = 1$.

Taking a logarithm we have $\ln(|P(n)|^{\frac{1}{n}}) = \frac{\ln|P(n)|}{n}$. If $m = 0$ the numerator is constant and the limit is zero. Otherwise, since both numerator and denominator approach infinity as $n$ approaches infinity, we get $\lim_{x\to\infty} \frac{\ln|P(x)|}{x} = \lim_{x\to\infty} \frac{P'(x)}{P(x)} = 0$ since $P'(x)$ is degree $m-1$ and $P(x)$ is degree $m$. Hence, $\lim_{n\to\infty} |P(n)|^{\frac{1}{n}} = e^0 = 1$ by Theorems 7.33 and 7.40.
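Both limits in Theorem 8.13 can be spot-checked numerically. In the sketch below the cubic `P` is an arbitrary example chosen for illustration, and the indices $10^6$ and $10^4$ are simply "large" values, so the output only suggests the limits rather than proving them.

```python
def P(x):
    """An arbitrary example polynomial (an assumption of this sketch)."""
    return 3 * x**3 - 5 * x + 7

# Ratio of consecutive values: expected to be close to 1 for large n.
ratio = P(10**6 + 1) / P(10**6)

# n-th root of |P(n)|: also expected to be close to 1 for large n.
root = abs(P(10**4)) ** (1 / 10**4)

print(ratio, root)
```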


Definition 60 


Let $\{x_n\}$ be a sequence. Then $\limsup\{x_n\} = \lim_{n\to\infty} \sup\{x_n, x_{n+1}, x_{n+2}, \ldots\}$. We also use the notation $\limsup x_n = \limsup\{x_n\}$ for brevity. Likewise, we define $\liminf\{x_n\} = \lim_{n\to\infty} \inf\{x_n, x_{n+1}, x_{n+2}, \ldots\}$.


Note that $\{\sup\{x_n, x_{n+1}, x_{n+2}, \ldots\}\}$ is non-increasing and $\{\inf\{x_n, x_{n+1}, x_{n+2}, \ldots\}\}$ is non-decreasing, and thus $\limsup\{x_n\}$ and $\liminf\{x_n\}$ always exist and are real numbers if $\{x_n\}$ is a bounded sequence (if $\{x_n\}$ is only bounded above then $\limsup\{x_n\}$ is either a real number or $-\infty$, and if $\{x_n\}$ is only bounded below then $\liminf\{x_n\}$ is either a real number or $\infty$). If $\{x_n\}$ is not bounded above then $\sup\{x_n, x_{n+1}, x_{n+2}, \ldots\} = \infty$ for each $n \in \mathbb{N}$ in which case we say that $\limsup\{x_n\} = \infty$. Likewise, if $\{x_n\}$ is not bounded below then we say $\liminf\{x_n\} = -\infty$.




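Theorem 8.14. Root Test. Let $\{a_n\}$ be a sequence so that $\limsup\{|a_n|^{\frac{1}{n}}\} = L$. Then $\sum_{n=1}^{\infty} a_n$ converges absolutely if $L < 1$, and $\sum_{n=1}^{\infty} a_n$ diverges if $L > 1$. If $\limsup\{|a_n|^{\frac{1}{n}}\} = \infty$ then $\sum_{n=1}^{\infty} a_n$ also diverges.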
Proof. If $L > 1$ or $\{|a_n|^{\frac{1}{n}}\}$ is not bounded above then $|a_n|^{\frac{1}{n}} > 1$ for infinitely many $n$, so $|a_n| > 1$ for infinitely many $n$, so $\sum_{n=1}^{\infty} a_n$ diverges by the Divergence Test.

Let $L < 1$. Choose $u \in (L, 1)$. Choose $k$ so that if $n \ge k$ then $|a_n|^{\frac{1}{n}} < u$. Then $|a_n| < u^n$ for $n \ge k$, and thus $\sum_{n=k}^{\infty} |a_n|$ converges by comparison with a geometric series so $\sum_{n=1}^{\infty} a_n$ converges absolutely.


Definition 61 


For a sequence $\{a_n\}$ we will use the notation $A_{(n,m)} = \sum_{i=m}^{n} a_i$ for $n \ge m$.

Essentially, the capital letter with index subscripts from the letter used to designate
the sequence can be used to indicate sums from one index to another.


Theorem 8.15. Abel's Formula. Let $\{a_i\}$ and $\{b_i\}$ be sequences and $n, m \in \mathbb{N}$ so that $n > m$. Then $\sum_{i=m}^{n} a_i b_i = A_{(n,m)} b_n - \sum_{i=m}^{n-1} A_{(i,m)}(b_{i+1} - b_i)$.

Proof. Writing out the right side of the equation we have $A_{(n,m)} b_n - \sum_{i=m}^{n-1} A_{(i,m)}(b_{i+1} - b_i) = b_n(a_m + a_{m+1} + \ldots + a_{n-1} + a_n) - b_n(a_m + a_{m+1} + \ldots + a_{n-1}) + b_{n-1}(a_m + a_{m+1} + \ldots + a_{n-1}) - b_{n-1}(a_m + a_{m+1} + \ldots + a_{n-2}) + \ldots + b_{m+1}(a_m + a_{m+1}) - b_{m+1}(a_m) + b_m a_m$. The only terms remaining after cancellation are $\sum_{i=m}^{n} a_i b_i$.
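Abel's Formula is a finite identity, so it can be verified exactly on small integer data. The following sketch does that; the helper names (`block_sum`, `abel_left`, `abel_right`) and the sample values are hypothetical, and index 0 of each list is padding so that positions match the subscripts used in the text.

```python
def block_sum(a, n, m):
    """A_(n,m): the sum a_m + a_{m+1} + ... + a_n."""
    return sum(a[i] for i in range(m, n + 1))

def abel_left(a, b, n, m):
    """Left side of Abel's Formula: sum of a_i * b_i for i = m..n."""
    return sum(a[i] * b[i] for i in range(m, n + 1))

def abel_right(a, b, n, m):
    """Right side: A_(n,m) * b_n minus the sum of A_(i,m) * (b_{i+1} - b_i) for i = m..n-1."""
    correction = sum(block_sum(a, i, m) * (b[i + 1] - b[i]) for i in range(m, n))
    return block_sum(a, n, m) * b[n] - correction

a = [0, 3, -1, 4, 1, -5, 9]  # sample values; a[0] is unused padding
b = [0, 2, 7, 1, 8, 2, 8]    # sample values; b[0] is unused padding
left = abel_left(a, b, 6, 2)
right = abel_right(a, b, 6, 2)
print(left, right)
```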


Theorem 8.16. Dirichlet's Test. Let $\{b_n\}$ be a decreasing sequence whose terms approach zero, and let $\{a_n\}$ be a sequence whose partial sums are bounded. Then $\sum_{i=1}^{\infty} a_i b_i$ converges.


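Proof. We can choose $M > 2\left|\sum_{i=1}^{n} a_i\right| = 2|A_{(n,1)}|$ for all $n \in \mathbb{N}$ since $\{a_n\}$ is a sequence whose partial sums are bounded. Let $\epsilon > 0$ and choose $k \in \mathbb{N}$ so that if $m > k$ then $b_m < \frac{\epsilon}{M}$. Then $|A_{(n,m)}| \le |A_{(n,1)}| + |A_{(m-1,1)}| < M$ for all $n, m \in \mathbb{N}$ so that $n \ge m$ (where the empty sum $A_{(0,1)}$ is taken to be zero). By Abel's formula, $\left|\sum_{i=m}^{n} a_i b_i\right| = \left|A_{(n,m)} b_n + \sum_{i=m}^{n-1} A_{(i,m)}(b_i - b_{i+1})\right| \le M b_n + M \sum_{i=m}^{n-1} (b_i - b_{i+1}) = M b_m < \epsilon$ if $m > k$ (since $b_i - b_{i+1}$ is always non-negative). Thus, $\sum_{i=1}^{\infty} a_i b_i$ converges by the Cauchy Convergence Criterion.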


Definition 62 


An alternating series is a series of the form $a_1 - a_2 + a_3 - a_4 + \ldots$ or the form $-a_1 + a_2 - a_3 + a_4 - \ldots$, where $a_1, a_2, a_3, \ldots$ are all positive numbers.


Theorem 8.17. Alternating Series Test. Let $\{b_i\}$ be a decreasing sequence converging to zero. Then $\sum_{i=1}^{\infty} (-1)^i b_i$ converges.

Proof. Setting $a_i = (-1)^i$ in Dirichlet's Test, we note that the partial sums of $\{a_i\}$ are bounded and therefore $\sum_{i=1}^{\infty} a_i b_i = \sum_{i=1}^{\infty} (-1)^i b_i$ converges.


While Dirichlet's Test has merit in and of itself, readers who are not interested in Abel's
Formula or Dirichlet's Test can simply prove the Alternating Series Test directly as follows.
This proof also includes the remainder theorem.


Theorem 8.18. Alternating Series Remainder Theorem. Let $\{a_i\}$ be a decreasing sequence converging to zero, and let $\sum_{i=1}^{\infty} (-1)^{i+1} a_i = L$. Then, for every $n \in \mathbb{N}$, it follows that $L$ is between $s_n$ and $s_{n+1}$.


Proof. Since $a_1 \ge a_2 \ge a_3 \ge \ldots$, each $(a_{2n-1} - a_{2n}) \ge 0$ and each $(-a_{2n} + a_{2n+1}) \le 0$, which means that $a_1 \ge a_1 + (-a_2 + a_3) \ge a_1 + (-a_2 + a_3) + (-a_4 + a_5) \ge \ldots$ and $(a_1 - a_2) \le (a_1 - a_2) + (a_3 - a_4) \le (a_1 - a_2) + (a_3 - a_4) + (a_5 - a_6) \le \ldots$. Since $s_1 \ge s_3 \ge s_5 \ge \ldots$ and $s_2 \le s_4 \le s_6 \le \ldots$, the odd indexed partial sums are a decreasing sequence and the even partial sums are an increasing sequence, both of which converge to $L$ since they are subsequences of $\{s_n\}$ which we know converges to $L$. Hence, if $n$ is even then $s_n \le L \le s_{n+1}$ and if $n$ is odd then $s_n \ge L \ge s_{n+1}$.
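The bracketing property can be observed numerically on the alternating harmonic series $1 - \frac{1}{2} + \frac{1}{3} - \ldots$. The sketch below assumes the known value $\ln 2$ for its sum, which is not established in this section, and the function names are hypothetical.

```python
import math

def partial_sum(n):
    """s_n for the alternating harmonic series 1 - 1/2 + 1/3 - ..."""
    return sum((-1) ** (i + 1) / i for i in range(1, n + 1))

def brackets_limit(n, limit):
    """True when the limit lies between the consecutive partial sums s_n and s_{n+1}."""
    s_n, s_next = partial_sum(n), partial_sum(n + 1)
    return min(s_n, s_next) <= limit <= max(s_n, s_next)

# ln(2) is the assumed sum of the series.
checks = [brackets_limit(n, math.log(2)) for n in range(1, 51)]
print(all(checks))
```

Since consecutive partial sums differ by $a_{n+1}$, this also means $|L - s_n| \le a_{n+1}$.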


Theorem 8.19. Log Test. Let $\{a_n\}$ be a sequence of positive terms so that $\lim_{n\to\infty} \frac{\ln(a_n^{-1})}{\ln(n)} = L$. Then if $L > 1$, $\sum_{n=1}^{\infty} a_n$ converges and if $L < 1$ then $\sum_{n=1}^{\infty} a_n$ diverges. Furthermore, if $\liminf \frac{\ln(a_n^{-1})}{\ln(n)} = L > 1$ then $\sum_{n=1}^{\infty} a_n$ converges, and if $\limsup \frac{\ln(a_n^{-1})}{\ln(n)} = L < 1$ then $\sum_{n=1}^{\infty} a_n$ diverges.
Proof. First assume that $L > 1$. Choose $u \in (1, L)$. If either $\lim_{n\to\infty} \frac{\ln(a_n^{-1})}{\ln(n)} = L$ or $\liminf \frac{\ln(a_n^{-1})}{\ln(n)} = L$, then there is some $k \in \mathbb{N}$ so that if $n > k$ then $\frac{\ln(a_n^{-1})}{\ln(n)} > u$, which means that $\log_n a_n^{-1} > u$, so $a_n^{-1} > n^u$, and thus $a_n < \frac{1}{n^u}$. By the Comparison Test and Exercise 8.1 it follows that $\sum_{n=1}^{\infty} a_n$ converges.

Next, assume that $L < 1$ and choose $u \in (L, 1)$. If either $\lim_{n\to\infty} \frac{\ln(a_n^{-1})}{\ln(n)} = L$ or $\limsup \frac{\ln(a_n^{-1})}{\ln(n)} = L$, then we can choose $k \in \mathbb{N}$ so that if $n > k$ then $\frac{\ln(a_n^{-1})}{\ln(n)} < u$. Then $a_n^{-1} < n^u$, so $a_n > \frac{1}{n^u} \ge \frac{1}{n}$. Thus, $\sum_{n=1}^{\infty} a_n$ diverges by the Comparison Test.
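Example 8.3. Determine whether $\sum_{n=2}^{\infty} \left(\frac{1}{\ln(n)}\right)^{\ln(n)}$ converges.

Solution. Since $\lim_{n\to\infty} \frac{\ln(\ln(n)^{\ln(n)})}{\ln(n)} = \lim_{n\to\infty} \frac{\ln(n)\ln(\ln(n))}{\ln(n)} = \lim_{n\to\infty} \ln(\ln(n)) = \infty > 1$ we conclude that $\sum_{n=2}^{\infty} \left(\frac{1}{\ln(n)}\right)^{\ln(n)}$ converges.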
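To get a feel for the Log Test quotient $\frac{\ln(a_n^{-1})}{\ln(n)}$, the sketch below evaluates it at one large index for two series: the $p$-series with $p = 2$ (an example added here for comparison) and the series from Example 8.3. The function name `log_test_quotient` is hypothetical, and a single index only suggests the limit.

```python
import math

def log_test_quotient(a_n, n):
    """ln(1/a_n) / ln(n), the quantity whose limit the Log Test examines."""
    return math.log(1.0 / a_n) / math.log(n)

n = 10**6
# For a_n = 1/n^2 the quotient is exactly 2 at every index n > 1.
p_series_quotient = log_test_quotient(1.0 / n**2, n)
# For a_n = (1/ln n)^(ln n) the quotient simplifies to ln(ln(n)), which grows without bound.
example_quotient = log_test_quotient((1.0 / math.log(n)) ** math.log(n), n)
print(p_series_quotient, example_quotient)
```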


Theorem 8.20. Comparison Remainder Theorem. Let $|a_n| \le b_n$ for all $n \in \mathbb{N}$ where $\sum_{n=1}^{\infty} b_n$ converges. Let $s_n = \sum_{i=1}^{n} a_i$ for each $n \in \mathbb{N}$ and let $\sum_{n=1}^{\infty} a_n = s$. Then $|s - s_n| \le \sum_{i=n+1}^{\infty} b_i$.

Proof. Let $n \in \mathbb{N}$. Since $\left|\sum_{i=n+1}^{k} a_i\right| \le \sum_{i=n+1}^{k} |a_i| \le \sum_{i=n+1}^{k} b_i$ for each natural number $k > n$, it follows that $|s - s_n| = \left|\sum_{i=n+1}^{\infty} a_i\right| \le \sum_{i=n+1}^{\infty} b_i$ by the Comparison Theorem.




An interesting property of absolute convergence concerns what happens when the terms
of a series are rearranged. If the terms of an absolutely convergent series are rearranged
in a different order then the sum remains unchanged. For conditionally convergent series,
however, the order in which the terms are added is critical, and you can get any sum you
wish by appropriately rearranging such a series.


Definition 63 


Let $g : \mathbb{N} \to \mathbb{N}$ be a one to one and onto function, and let $\{a_n\}$ be a sequence. Then $\{a_{g(n)}\}$ is a rearrangement of $\{a_n\}$.


Theorem 8.21. Let $a_n \ge 0$ for each $n \in \mathbb{N}$ and let $\sum_{n=1}^{\infty} a_n = L$. Let $S$ be a finite subset of the natural numbers. Then $\sum_{n \in S} a_n \le L$.

Proof. Let $s_n = \sum_{i=1}^{n} a_i$ denote the $n$th partial sum of $\{a_i\}$. Then $\{s_n\}$ is increasing and converges to its least upper bound $L$. Let $M = \max(S)$. Then $\sum_{n \in S} a_n \le s_M \le L$ since the terms added in $s_M$ are all non-negative and include every summand in $\sum_{n \in S} a_n$.

Theorem 8.22. Let $\sum_{n=1}^{\infty} a_n$ be an absolutely convergent series with sum $s$. Let $g : \mathbb{N} \to \mathbb{N}$ be a one to one and onto function, and let $\{b_n\} = \{a_{g(n)}\}$ be a rearrangement of $\{a_n\}$. Then $\sum_{n=1}^{\infty} b_n = s$.


Proof. Let $\epsilon > 0$. Let $L = \sum_{n=1}^{\infty} |a_n|$. Choose a natural number $k$ so that if $n > k$ then $L - \sum_{i=1}^{n-1} |a_i| < \epsilon$, so $\sum_{i=n}^{\infty} |a_i| < \epsilon$. Let $N = \max\{g^{-1}(1), \ldots, g^{-1}(k)\}$. Then for all $1 \le i \le k$ it is true that $i = g(j)$, and hence $a_i = b_j$, for some $1 \le j \le N$.

Let $m > N$. If we look at the difference $\sum_{i=1}^{m} a_i - \sum_{i=1}^{m} b_i$, all terms $a_i$ for $1 \le i \le k$ cancel with terms $b_j$ for some $1 \le j \le N < m$. The terms remaining are terms of the form $\pm a_i$ for $i > k$. Let $S$ be the set of indices $i$ so that $\pm a_i$ is a summand of $\sum_{j=1}^{m} a_j - \sum_{j=1}^{m} b_j$ after cancellation. In other words, $S$ consists of the indices $i \le m$ with $i \neq g(j)$ for every $1 \le j \le m$, together with the indices $i = g(j) > m$ for some $1 \le j \le m$. Then $\sum_{i \in S} |a_i| \le \sum_{i=k+1}^{\infty} |a_i| < \epsilon$ by Theorem 8.21. Thus, $\left|\sum_{i=1}^{m} a_i - \sum_{i=1}^{m} b_i\right| \le \sum_{i \in S} |a_i| < \epsilon$. From this we conclude that the sequence $\left\{\sum_{i=1}^{m} a_i - \sum_{i=1}^{m} b_i\right\} \to 0$, so $\sum_{n=1}^{\infty} b_n = s$.


Theorem 8.23. Let $\sum_{n=1}^{\infty} a_n = L \in \mathbb{R}$. Then $\sum_{n=1}^{\infty} a_n$ is conditionally convergent if and only if there are subsequences $\{a_{p(i)}\}$ consisting of all non-negative terms of $\{a_n\}$ and $\{a_{n(i)}\}$ consisting of all negative terms of $\{a_n\}$ having the property that $\sum_{i=1}^{\infty} a_{p(i)} = \infty$ and $\sum_{i=1}^{\infty} a_{n(i)} = -\infty$.
Proof. First, assume that $\sum_{n=1}^{\infty} a_n$ is conditionally convergent. Note that if there are only finitely many negative terms of $\{a_n\}$ then if we set $S = \{i \in \mathbb{N} \mid a_i < 0\}$ we would have, for any integer $m > \max(S)$, that $\sum_{i=1}^{m} |a_i| = \sum_{i=1}^{m} a_i + 2\sum_{i \in S} |a_i|$, which means that $\sum_{i=1}^{\infty} |a_i| = L + 2\sum_{i \in S} |a_i|$, so $\sum_{n=1}^{\infty} a_n$ would be absolutely convergent. Similarly, if there were only finitely many indices $i$ so that $a_i \ge 0$ then the series would be absolutely convergent. Thus, there are subsequences $\{a_{p(i)}\}$ and $\{a_{n(i)}\}$ of $\{a_n\}$ so that $a_{p(i)}$ and $a_{n(i)}$ are the $i$th non-negative term and the $i$th negative term, respectively, of $\{a_n\}$.

Next, suppose that $\sum_{i=1}^{\infty} a_{p(i)} = P$ for some real number $P$ and $\sum_{i=1}^{\infty} a_{n(i)} = -\infty$. Let $M < 0$. Then we can choose $k$ so that if $j \ge k$ then $\sum_{i=1}^{j} a_{n(i)} < M - P$. Let $m > n(k)$. Set $S = \{i \in \mathbb{N} \mid p(i) \le m\}$. Let $T = \{i \in \mathbb{N} \mid n(i) \le m\}$. Then $\sum_{i=1}^{m} a_i = \sum_{i \in S} a_{p(i)} + \sum_{i \in T} a_{n(i)}$. We know that $\sum_{i \in S} a_{p(i)} \le P$ by Theorem 8.21, and we know that $\sum_{i \in T} a_{n(i)}$ consists of a sum of negative terms including all $a_{n(i)}$ with $i \le k$, which means that $\sum_{i \in T} a_{n(i)} < M - P$. Thus, $\sum_{i=1}^{m} a_i < M$, so $\sum_{i=1}^{\infty} a_i = -\infty$, which contradicts $\sum_{n=1}^{\infty} a_n = L$. It follows, similarly, that it is impossible for $\sum_{i=1}^{\infty} a_{n(i)} = N$ for a real number $N$ and $\sum_{i=1}^{\infty} a_{p(i)} = \infty$.

Suppose that $\sum_{i=1}^{\infty} a_{p(i)} = P$ and $\sum_{i=1}^{\infty} a_{n(i)} = N$. Then $\sum_{i=1}^{\infty} |a_{p(i)}| = P$ and $\sum_{i=1}^{\infty} |a_{n(i)}| = -N$. Then for any natural number $m$ it follows that if we set $S = \{i \in \mathbb{N} \mid p(i) \le m\}$ and $T = \{i \in \mathbb{N} \mid n(i) \le m\}$ then $\sum_{i=1}^{m} |a_i| = \sum_{i \in S} |a_{p(i)}| + \sum_{i \in T} |a_{n(i)}| \le P - N$. Hence, $\sum_{i=1}^{\infty} |a_i|$ converges to some value less than or equal to $P - N$, which is impossible for a conditionally convergent series.

We conclude that if $\sum_{n=1}^{\infty} a_n$ is conditionally convergent then $\sum_{i=1}^{\infty} a_{p(i)} = \infty$ and $\sum_{i=1}^{\infty} a_{n(i)} = -\infty$.

Finally assume that $\sum_{i=1}^{\infty} a_{p(i)} = \infty$ and $\sum_{i=1}^{\infty} a_{n(i)} = -\infty$. Then given any number $M > 0$ we can find a natural number $k$ so that $\sum_{i=1}^{k} a_{p(i)} > M$, so $\sum_{i=1}^{p(k)} |a_i| \ge \sum_{i=1}^{k} a_{p(i)} > M$, which means that $\sum_{i=1}^{\infty} |a_i| = \infty$ so $\sum_{n=1}^{\infty} a_n$ is not absolutely convergent (and must be conditionally convergent).


Theorem 8.24. Let $\sum_{n=1}^{\infty} a_n$ be a conditionally convergent series. Then if $L \in \mathbb{R}$ or $L = \pm\infty$ then there is a one to one and onto function $g : \mathbb{N} \to \mathbb{N}$ so that $\sum_{i=1}^{\infty} a_{g(i)} = L$.


Proof. By Theorem 8.23, we can find sequences $\{a_{p(i)}\}$ consisting of all non-negative terms of $\{a_n\}$ and $\{a_{n(i)}\}$ consisting of all negative terms of $\{a_n\}$ having the property that $\sum_{i=1}^{\infty} a_{p(i)} = \infty$ and $\sum_{i=1}^{\infty} a_{n(i)} = -\infty$.

First, assume that $L = \infty$. Let $m_1$ be the first integer so that $\sum_{i=1}^{m_1} a_{p(i)} > 1$. Then let $m_2$ be the first positive integer exceeding $m_1$ so that $\sum_{i=1}^{m_2} a_{p(i)} + a_{n(1)} > 2$. Inductively, choose $m_{k+1}$ to be the first positive integer exceeding $m_k$ so that $\sum_{i=1}^{m_{k+1}} a_{p(i)} + \sum_{i=1}^{k} a_{n(i)} > k+1$. We can always make such a choice since $\sum_{i=1}^{\infty} a_{p(i)} = \infty$. We then define the rearrangement $\{b_i\} = \{a_{p(1)}, a_{p(2)}, \ldots, a_{p(m_1)}, a_{n(1)}, a_{p(m_1+1)}, a_{p(m_1+2)}, \ldots, a_{p(m_2)}, a_{n(2)}, a_{p(m_2+1)}, \ldots\}$. Since $\{a_n\} \to 0$, there is an index $s_0$ so that $|a_{n(r)}| < 1$ for all $r \ge s_0$. Note that for any integer $s \ge s_0$ it is the case that if $m_s + s - 1 \le t$ then $\sum_{i=1}^{t} b_i > s - 1$. Thus, $\sum_{i=1}^{\infty} b_i = \infty$. Similarly, we can find a rearrangement $\{b_i\}$ whose sum is $-\infty$.

Next, let $L \in \mathbb{R}$. We make the rearrangement choices as follows. We set $b_1 = a_{p(1)}$. If we have chosen the first $k$ members of the rearrangement then if $\sum_{i=1}^{k} b_i \le L$ we choose $b_{k+1}$ to be $a_{p(t)}$, where $t$ is the first index so that the term $a_{p(t)}$ has not yet been used among $b_1, b_2, \ldots, b_k$. If $\sum_{i=1}^{k} b_i > L$ we choose $b_{k+1}$ to be $a_{n(s)}$, where $s$ is the first index so that the term $a_{n(s)}$ has not yet been used among $b_1, b_2, \ldots, b_k$. Thus, the partial sums of $\{b_i\}$ are non-decreasing until they exceed $L$, then decreasing until they no longer exceed $L$, and then non-decreasing until they exceed $L$ again, and so forth. Let $\epsilon > 0$. Since $\sum_{n=1}^{\infty} a_n$ is convergent, $\{a_n\} \to 0$, so only finitely many terms satisfy $|a_i| \ge \epsilon$ and we can find an integer $m$ so that if $i > m$ then $|b_i| < \epsilon$. We will assume that $\sum_{i=1}^{m} b_i \le L$ (the case where $\sum_{i=1}^{m} b_i > L$ is similar). Since $\sum_{i=1}^{\infty} a_{p(i)} = \infty$ there will be a first $j_1 > m$ so that $\sum_{i=1}^{j_1} b_i > L$. Then since $\sum_{i=1}^{\infty} a_{n(i)} = -\infty$ there is a first $j_2 > j_1$ so that $\sum_{i=1}^{j_2} b_i \le L$. For all $j_1 \le j \le j_2$ it follows that $L + b_{j_2} \le s_j \le L + b_{j_1}$ (where $s_j = \sum_{i=1}^{j} b_i$). Likewise, if $j_3$ is the first integer so that $j_3 > j_2$ and $\sum_{i=1}^{j_3} b_i > L$ then for all $j_2 \le j \le j_3$ it follows that $s_j$ is equal to or between $L + b_{j_2}$ and $L + b_{j_3}$, and so forth. Thus, since each $|b_{j_i}| < \epsilon$ it follows that $|L - s_j| < \epsilon$ for all $j \ge j_1$. Hence, $\sum_{i=1}^{\infty} b_i = L$.
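The construction for a real target $L$ can be carried out mechanically. The sketch below applies it to the alternating harmonic series $1 - \frac{1}{2} + \frac{1}{3} - \ldots$, chosen here as a standard conditionally convergent example; the function name `rearranged_partial_sums` and the target $2$ are assumptions of this illustration. The rule is applied from the very first step, which agrees with $b_1 = a_{p(1)}$ whenever the target is non-negative.

```python
import itertools

def rearranged_partial_sums(target, steps):
    """Partial sums of the greedy rearrangement of 1 - 1/2 + 1/3 - ... aimed at `target`."""
    positives = (1.0 / n for n in itertools.count(1, 2))   # 1, 1/3, 1/5, ...
    negatives = (-1.0 / n for n in itertools.count(2, 2))  # -1/2, -1/4, -1/6, ...
    total = 0.0
    sums = []
    for _ in range(steps):
        # Take the next unused non-negative term while at or below the target,
        # and the next unused negative term while above it.
        total += next(positives) if total <= target else next(negatives)
        sums.append(total)
    return sums

sums = rearranged_partial_sums(2.0, 100_000)
print(sums[-1])
```

In the usual order this series sums to $\ln 2$ (a known value not derived here), yet the rearranged partial sums settle near the chosen target instead.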


The following weaker form of Stirling’s Formula is often helpful in using the Log Test 
to determine the convergence or divergence of series that involve a factorial: 


Theorem 8.25. Weaker form of Stirling's Formula. For each $C > 1$ there is a $k \in \mathbb{N}$ so that if $n$ is a natural number larger than $k$ then $n\ln(n) - n < \ln(n!) < n\ln(n) - n + C\ln(n)$.


Proof. First, note $\ln(n!) = \ln(1) + \ln(2) + \ldots + \ln(n) = \sum_{i=1}^{n} \ln(i)$. For any natural number $n \ge 2$ we note that since $\ln(x)$ is an increasing function, it must be the case that $\int_{n-1}^{n} \ln(x)\,dx < \ln(n) < \int_{n}^{n+1} \ln(x)\,dx$. Thus, $\sum_{i=2}^{n} \int_{i-1}^{i} \ln(x)\,dx < \sum_{i=2}^{n} \ln(i) < \sum_{i=2}^{n} \int_{i}^{i+1} \ln(x)\,dx$, so $\int_{1}^{n} \ln(x)\,dx < \ln(n!) < \int_{2}^{n+1} \ln(x)\,dx < \int_{1}^{n+1} \ln(x)\,dx$. Evaluating these integrals, we get $n\ln(n) - n + 1 < \ln(n!) < (n+1)\ln(n+1) - (n+1) + 1$. Thus, $(n\ln(n) - n) < \ln(n!) < (n+1)\ln(n+1) - n$, which means that $0 < \ln(n!) - (n\ln(n) - n) < n\ln(n+1) - n\ln(n) + \ln(n+1) + (n - n)$, so $0 < \ln(n!) - (n\ln(n) - n) < n\ln(1 + \frac{1}{n}) + \ln(n+1)$. Notice $n\ln(1 + \frac{1}{n}) = \ln((1 + \frac{1}{n})^n)$ which approaches $\ln(e) = 1$ as $n$ approaches infinity. Also, $\ln(n+1) - \ln(n) = \ln(1 + \frac{1}{n})$ approaches zero. Thus, if $\epsilon = C - 1 > 0$ then we can find $k \in \mathbb{N}$ so that if $n > k$ then $n\ln(1 + \frac{1}{n}) < \frac{\epsilon}{2}\ln(n)$ and $\ln(n+1) < (1 + \frac{\epsilon}{2})\ln(n)$ which means that $n\ln(1 + \frac{1}{n}) + \ln(n+1) < (1 + \epsilon)\ln(n) = C\ln(n)$.
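The bounds in Theorem 8.25 can be checked numerically for a particular constant. The sketch below uses $C = 1.1$ and the range $10 \le n < 2000$, both arbitrary choices for illustration, and relies on the standard library fact that `math.lgamma(n + 1)` equals $\ln(n!)$. The helper names are hypothetical.

```python
import math

def stirling_gap(n):
    """ln(n!) - (n ln(n) - n), computed with lgamma(n + 1) = ln(n!)."""
    return math.lgamma(n + 1) - (n * math.log(n) - n)

def bounds_hold(n, C):
    """True when n ln(n) - n < ln(n!) < n ln(n) - n + C ln(n)."""
    return 0 < stirling_gap(n) < C * math.log(n)

print(all(bounds_hold(n, 1.1) for n in range(10, 2000)))
print(bounds_hold(2, 1.1))  # small n can fail, which is why the theorem requires n > k
```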


We will refrain from giving a list of hints for the exercises in the remaining sections. 
It is hoped that readers have a fairly good idea where to start looking at definitions and 
theorems relating to a problem at this point, having practiced the process in earlier sections. 




Exercises: 


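Exercise 8.1. Prove that $\sum_{n=1}^{\infty} \frac{1}{n^p}$ converges if and only if $p > 1$.

Exercise 8.2. Let $a_n = \frac{n^n}{n!}$. Prove that $\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = e$.

Exercise 8.3. Determine whether each series converges or diverges, with justification.

(a) $\sum_{n=1}^{\infty} \frac{n!}{n^n}$
(b) $\sum_{n=1}^{\infty} \frac{\cos(n)}{n^2}$
(c) $\sum_{n=1}^{\infty} (-1)^n \left(1 - \frac{2}{n}\right)^n$
(d) $\sum_{n=1}^{\infty} \frac{\ln(n)}{n^2}$
(e) $\sum_{n=2}^{\infty} \frac{1}{\ln(n)^{\ln(n)}}$

Exercise 8.4. Give an example, with justification, of a series $\sum_{n=1}^{\infty} (-1)^n a_n$ which is divergent, where $a_n > 0$ for all $n$ and $\{a_n\} \to 0$.

Exercise 8.5. Give an example of a series $\sum_{n=1}^{\infty} (-1)^n a_n$ which is convergent and $\sum_{n=1}^{\infty} (-1)^n b_n$ which is divergent so that $\left\{\frac{b_n}{a_n}\right\} \to L \neq 0$.

Exercise 8.6. Let $|f^{(n)}(x)| \le M_n$ on the interval $[a,b]$ for all $n \in \mathbb{N}$, and let $c \in [a,b]$. Then $f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(c)(x-c)^n}{n!}$ for all $x \in [a,b]$ if $\lim_{n\to\infty} \frac{M_n (b-a)^n}{n!} = 0$.

Exercise 8.7. Give an example of series $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ which both converge, so that $\sum_{n=1}^{\infty} a_n b_n$ diverges.

Exercise 8.8. Prove that if $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ converge absolutely then $\sum_{n=1}^{\infty} a_n b_n$ must also converge.

Exercise 8.9. Prove that if $\limsup\{|c_n|^{\frac{1}{n}}\} = L < 1$ then $\lim_{n\to\infty} c_n = 0$.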
Solutions to Exercises in Chapter 8. 
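Solution to Exercise 8.1. Prove that $\sum_{n=1}^{\infty} \frac{1}{n^p}$ converges if and only if $p > 1$.

Proof. If $p \le 0$ the terms do not approach zero, so the series diverges by the Divergence Test. For $p > 0$, using the integral test we have $\int_1^{\infty} \frac{1}{x}\,dx = \lim_{b\to\infty} \ln(b) = \infty$ if $p = 1$ and otherwise $\int_1^{\infty} \frac{1}{x^p}\,dx = \lim_{b\to\infty} \left(\frac{b^{1-p}}{1-p} - \frac{1}{1-p}\right)$, which is $\infty$ if $p < 1$ and $\frac{1}{p-1}$ if $p > 1$.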


n 


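Solution to Exercise 8.2. Let $a_n = \frac{n^n}{n!}$. Prove that $\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = e$.

Proof. We take $\lim_{n\to\infty} \frac{(n+1)^{n+1}}{(n+1)!} \cdot \frac{n!}{n^n} = \lim_{n\to\infty} \frac{(n+1)^n}{n^n} = \lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^n = e$.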


Solution to Exercise 8.3. Determine whether each series converges or diverges, with justification.

(a) $\sum_{n=1}^{\infty} \frac{n!}{n^n}$
(b) $\sum_{n=1}^{\infty} \frac{\cos(n)}{n^2}$
(c) $\sum_{n=1}^{\infty} (-1)^n \left(1 - \frac{2}{n}\right)^n$
(d) $\sum_{n=1}^{\infty} \frac{\ln(n)}{n^2}$
(e) $\sum_{n=2}^{\infty} \frac{1}{\ln(n)^{\ln(n)}}$

Proof. (a) Using the Ratio Test and the preceding exercise we have $\lim_{n\to\infty} \frac{(n+1)!}{(n+1)^{n+1}} \cdot \frac{n^n}{n!} = \frac{1}{e} < 1$. Thus, the series converges. We could also have used the Comparison Test with $\frac{2}{n^2}$, the Log Test, or even the Root Test (though the Root Test is a bit harder).

(b) Using the Comparison Test we have that $\frac{|\cos(n)|}{n^2} \le \frac{1}{n^2}$. Since $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges by Exercise 8.1, it follows that $\sum_{n=1}^{\infty} \frac{\cos(n)}{n^2}$ converges absolutely and therefore converges.

(c) We can use the Divergence Test. If we set $u = -\frac{n}{2}$ then $n = -2u$, so $\left\{\left(1 - \frac{2}{n}\right)^n\right\} = \left\{\left(\left(1 + \frac{1}{u}\right)^u\right)^{-2}\right\} \to \frac{1}{e^2} \neq 0$, which means that $(-1)^n \left(1 - \frac{2}{n}\right)^n \not\to 0$, and hence $\sum_{n=1}^{\infty} (-1)^n \left(1 - \frac{2}{n}\right)^n$ diverges by the Divergence Test.

(d) We can use the Comparison or Log Test. For the Comparison Test, we note that $\sqrt{x} > \ln(x)$ for sufficiently large $x$ (actually, for all $x > 0$, but we do not need that). One way to see this is using L'Hospital's Rule to find $\lim_{x\to\infty} \frac{\ln(x)}{\sqrt{x}} = \lim_{x\to\infty} \frac{1/x}{1/(2\sqrt{x})} = \lim_{x\to\infty} \frac{2}{\sqrt{x}} = 0$. Thus, for some $M$ if $x > M$ then $\frac{\ln(x)}{\sqrt{x}} < 1$ which means that $\ln(x) < \sqrt{x}$. Once this has been established we can use the Comparison Test by noting that for sufficiently large values of $n$ it is true that $\frac{\ln(n)}{n^2} < \frac{\sqrt{n}}{n^2} = \frac{1}{n^{3/2}}$. We know that $\sum_{n=1}^{\infty} \frac{1}{n^{3/2}}$ converges by Exercise 8.1, and so $\sum_{n=1}^{\infty} \frac{\ln(n)}{n^2}$ converges by the Comparison Test.

The Log Test may be shorter. You take $\lim_{n\to\infty} \frac{\ln\left(\frac{n^2}{\ln(n)}\right)}{\ln(n)} = \lim_{n\to\infty} \frac{2\ln(n) - \ln(\ln(n))}{\ln(n)} = \lim_{n\to\infty} \left(2 - \frac{\ln(\ln(n))}{\ln(n)}\right) = 2 > 1$ since $\lim_{n\to\infty} \frac{\ln(\ln(n))}{\ln(n)} = \lim_{n\to\infty} \frac{\frac{1}{n\ln(n)}}{\frac{1}{n}} = \lim_{n\to\infty} \frac{1}{\ln(n)} = 0$ by L'Hospital's Rule.

(e) Using the Log Test we have $\lim_{n\to\infty} \frac{\ln(\ln(n)^{\ln(n)})}{\ln(n)} = \lim_{n\to\infty} \ln(\ln(n)) = \infty$, so the series converges.
Solution to Exercise 8.4. Give an example, with justification, of a series $\sum_{n=1}^{\infty} (-1)^n a_n$ which is divergent, where $a_n > 0$ for all $n$ and $\{a_n\} \to 0$.

Proof. We could use $a_n = \frac{1}{n}$ if $n$ is odd, and $a_n = \frac{1}{2^n}$ when $n$ is even. We know that $\sum_{n=1}^{\infty} \frac{1}{n} = \infty$ by Exercise 8.1 since the partial sums $s_n$ of this series are increasing and thus only diverge if $\{s_n\}$ is not bounded above and hence becomes and remains larger than any given number for sufficiently large $n$. Thus, $\sum_{n=1}^{\infty} \frac{1}{2n} = \infty$ and since $\frac{1}{2n} < \frac{1}{2n-1}$ it follows that $\sum_{n=1}^{\infty} \frac{1}{2n-1} = \infty$ by the Comparison Test. Thus, $\sum_{n=1}^{\infty} \frac{-1}{2n-1} = -\infty$ whereas $\sum_{n=1}^{\infty} a_{2n} = \sum_{n=1}^{\infty} \frac{1}{2^{2n}} = \frac{1}{3}$. Hence, given any $M < 0$ we can find an integer $k$ so that $\sum_{n=1}^{k} \frac{-1}{2n-1} < M - \frac{1}{3}$ which means that if $N = 2k+1$ then if $j > N$ we have that $\sum_{n=1}^{j} (-1)^n a_n \le \sum_{n=1}^{k} \frac{-1}{2n-1} + \sum_{n=1}^{\infty} \frac{1}{2^{2n}} < M - \frac{1}{3} + \frac{1}{3} = M$. Hence, $\sum_{n=1}^{\infty} (-1)^n a_n = -\infty$.




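Solution to Exercise 8.5. Give an example of a series $\sum_{n=1}^{\infty} (-1)^n a_n$ which is convergent and $\sum_{n=1}^{\infty} (-1)^n b_n$ which is divergent so that $\left\{\frac{b_n}{a_n}\right\} \to L \neq 0$.

Proof. Let $a_n = \frac{1}{\sqrt{n}}$ and let $b_n = \frac{1}{\sqrt{n}} + \frac{(-1)^{n+1}}{n}$. Then $\frac{b_n}{a_n} = 1 + \frac{(-1)^{n+1}}{\sqrt{n}}$ which converges to $1 \neq 0$, so $\left\{\frac{b_n}{a_n}\right\} \to 1$. By the Alternating Series Test, $\sum_{n=1}^{\infty} (-1)^n a_n$ converges, but $\sum_{n=1}^{\infty} (-1)^n b_n = \sum_{n=1}^{\infty} \left(\frac{(-1)^n}{\sqrt{n}} - \frac{1}{n}\right)$. We know that the partial sums of $\sum_{n=1}^{\infty} \frac{(-1)^n}{\sqrt{n}}$ are bounded and that the partial sums of $\sum_{n=1}^{\infty} \frac{1}{n}$ diverge to infinity, from which we can conclude that $\sum_{n=1}^{\infty} (-1)^n b_n$ diverges.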


Solution to Exercise 8.6. Let $|f^{(n)}(x)| \le M_n$ on the interval $[a,b]$ for all $n \in \mathbb{N}$, and let $c \in [a,b]$. Then $f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(c)(x-c)^n}{n!}$ for all $x \in [a,b]$ if $\lim_{n\to\infty} \frac{M_n (b-a)^n}{n!} = 0$.

Proof. This follows immediately from the Squeeze Theorem and the second form of Taylor's Theorem, which states that $f(x) = \sum_{i=0}^{n} \frac{f^{(i)}(c)(x-c)^i}{i!} + \frac{f^{(n+1)}(e)(x-c)^{n+1}}{(n+1)!}$ for some $e$ between $x$ and $c$. Since we know that $0 \le \left|\frac{f^{(n+1)}(e)(x-c)^{n+1}}{(n+1)!}\right| \le \frac{M_{n+1}(b-a)^{n+1}}{(n+1)!}$, by the Squeeze Theorem, we can conclude that $\lim_{n\to\infty} \frac{f^{(n+1)}(e)(x-c)^{n+1}}{(n+1)!} = 0$, which means $f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(c)(x-c)^n}{n!}$.
Solution to Exercise 8.7. Give an example of series $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ which both converge, so that $\sum_{n=1}^{\infty} a_n b_n$ diverges.

Proof. Take $a_n = b_n = \frac{(-1)^n}{\sqrt{n}}$. Their product $\frac{1}{n}$ is a sequence whose sum diverges by Exercise 8.1, and the original series converge by the Alternating Series Test.


Solution to Exercise 8.8. Prove that if $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ converge absolutely then $\sum_{n=1}^{\infty} a_n b_n$ must also converge.

Proof. Since $\sum_{n=1}^{\infty} b_n$ converges we know that $\{b_n\} \to 0$ by the Divergence Test, so for some $k$ if $n > k$ then $|b_n| < 1$. Thus, $|a_n b_n| \le |a_n|$ for $n > k$. Since $\sum_{n=1}^{\infty} |a_n|$ converges, by the Comparison Test we know $\sum_{n=1}^{\infty} |a_n b_n|$ converges, which implies that $\sum_{n=1}^{\infty} a_n b_n$ converges.


Solution to Exercise 8.9. Prove that if $\limsup\{|c_n|^{\frac{1}{n}}\} = L < 1$ then $\lim_{n\to\infty} c_n = 0$.

Proof. Using the Root Test we see that $\sum_{n=1}^{\infty} c_n$ converges, so $\{c_n\} \to 0$ by the Divergence Test.


Chapter 9 


Sequences of Functions 


Definition 64 


Let $f_n : D \to \mathbb{R}$ for each $n \in \mathbb{N}$. We say $\{f_n\} \to f$ for a function $f : D \to \mathbb{R}$ (or converges pointwise to $f$) if, for each $x \in D$, $\{f_n(x)\} \to f(x)$. We say that $\{f_n\} \to f$ uniformly on $D$ if for every $\epsilon > 0$ there is a $k \in \mathbb{N}$ so that if $n \ge k$ then $|f_n(x) - f(x)| < \epsilon$ for all $x \in D$.


Most useful properties are not preserved under pointwise convergence of sequences of 
functions, but many are preserved under uniform convergence. 
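A standard example separating the two notions, added here for illustration and not taken from the text, is $f_n(x) = x^n$: it converges pointwise to $0$ on $[0,1)$, but the convergence is uniform only on smaller intervals such as $[0, \frac{1}{2}]$. The sketch below approximates $\sup |f_n(x) - 0|$ on a grid; the function name `sup_distance` and the grid size are assumptions.

```python
def sup_distance(n, right_end, samples=10001):
    """Approximate the supremum of |x^n - 0| over an evenly spaced grid on [0, right_end]."""
    return max((right_end * i / (samples - 1)) ** n for i in range(samples))

# On [0, 0.5] the supremum (0.5)^n shrinks to zero, as uniform convergence requires.
print(sup_distance(50, 0.5))
# On [0, 0.999] the supremum is still near 1 at n = 50, so no single k works near x = 1.
print(sup_distance(50, 0.999))
```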


Theorem 9.1. Let $f_n : D \to \mathbb{R}$ be continuous at $x_0$ for each $n \in \mathbb{N}$, and let $\{f_n\} \to f$ uniformly on $D$. Then $f$ is continuous at $x_0$.

Proof. Let $\epsilon > 0$. Choose $k \in \mathbb{N}$ so that if $n \ge k$ then $|f_n(x) - f(x)| < \frac{\epsilon}{3}$ for all $x \in D$. Choose $\delta > 0$ so that if $|x - x_0| < \delta$ and $x \in D$ then $|f_k(x) - f_k(x_0)| < \frac{\epsilon}{3}$. Then if $|x - x_0| < \delta$ and $x \in D$, it follows that $|f(x) - f(x_0)| \le |f_k(x) - f(x)| + |f_k(x) - f_k(x_0)| + |f_k(x_0) - f(x_0)| < \epsilon$. Thus, $f$ is continuous at $x_0$.


Theorem 9.2. The Uniform Cauchy Criterion. Let $f_n : D \to \mathbb{R}$ for each $n \in \mathbb{N}$. Then there is a function $f$ so that $\{f_n\} \to f$ uniformly on $D$ if and only if, for every $\epsilon > 0$, there is a $k \in \mathbb{N}$ so that if $n, m \ge k$ then $|f_n(x) - f_m(x)| < \epsilon$ for all $x \in D$.

Proof. Assume first that for every $\epsilon > 0$ there is a $k \in \mathbb{N}$ so that if $n, m \ge k$ then $|f_n(x) - f_m(x)| < \epsilon$ for all $x \in D$. For each $x \in D$ we note that for every $\epsilon > 0$ there is a $k \in \mathbb{N}$ so that if $n, m \ge k$ then $|f_n(x) - f_m(x)| < \epsilon$. Hence, $\{f_n(x)\}$ is a Cauchy sequence which by previous theorems converges to a point which we define to be $f(x)$.

Let $\epsilon > 0$. Then we can find a $k \in \mathbb{N}$ so that if $n, m \ge k$ then $|f_n(x) - f_m(x)| < \frac{\epsilon}{2}$ for all $x \in D$. Thus, $|f_n(x) - f(x)| = \lim_{m\to\infty} |f_n(x) - f_m(x)| \le \frac{\epsilon}{2} < \epsilon$, so $\{f_n\} \to f$ uniformly.

Next, assume that there is a function $f$ so that $\{f_n\} \to f$ uniformly on $D$ and let $\epsilon > 0$. Then choose $k \in \mathbb{N}$ so that if $n \ge k$ then $|f_n(x) - f(x)| < \frac{\epsilon}{2}$ for all $x \in D$. If $n, m \ge k$ then it follows that $|f_n(x) - f_m(x)| \le |f_n(x) - f(x)| + |f_m(x) - f(x)| < \epsilon$.


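Theorem 9.3. Let $f_n : D \to \mathbb{R}$ be integrable on $[a,b]$ for each $n \in \mathbb{N}$, where $\{f_n\}$ converges uniformly to the function $f(x)$ on a closed interval $[a,b]$. Then $\left\{\int_a^x f_n\right\} \to \int_a^x f$ uniformly on $[a,b]$. In particular, $\lim_{n\to\infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx$.

Proof. First, if we let $E_i$ be the set of points at which $f_i$ is discontinuous and $E$ be the set of points at which $f$ is discontinuous then $E \subseteq \bigcup_{i=1}^{\infty} E_i$ by Theorem 9.1, and so $\lambda(E) = 0$ by Theorem 7.62, and thus $f$ is integrable by the Lebesgue Characterization of Riemann Integrability. Let $\epsilon > 0$ and let $k$ be an integer such that if $n \ge k$ then $|f(x) - f_n(x)| < \frac{\epsilon}{3(b-a)}$ for all $x \in [a,b]$. Choose $m \ge k$ and $x \in [a,b]$, and select a partition $P = \{x_0, x_1, \ldots, x_t\}$ of $[a,x]$ so that $U(f_m, P) - L(f_m, P) < \frac{\epsilon}{3}$ and $U(f, P) - L(f, P) < \frac{\epsilon}{3}$.

Choose any marking $T = \{x_1^*, \ldots, x_t^*\}$ of $P$. Then $|S_T(f_m, P) - S_T(f, P)| = \left|\sum_{i=1}^{t} (f_m(x_i^*) - f(x_i^*))(x_i - x_{i-1})\right| \le \sum_{i=1}^{t} \frac{\epsilon}{3(b-a)}(x_i - x_{i-1}) = \frac{\epsilon}{3(b-a)}(x - a) \le \frac{\epsilon}{3}$. Thus, $\left|\int_a^x f_m - \int_a^x f\right| \le \left|\int_a^x f_m - S_T(f_m, P)\right| + |S_T(f_m, P) - S_T(f, P)| + \left|S_T(f, P) - \int_a^x f\right| < \epsilon$. Thus, $\left\{\int_a^x f_n\right\} \to \int_a^x f$ uniformly on $[a,b]$ and $\lim_{n\to\infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx$.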


Theorem 9.4. Let $f_n$ be continuously differentiable on $[a,b]$ for each $n \in \mathbb{N}$, and let $\{f_n'\} \to f'$ uniformly on $[a,b]$, and for some $s \in (a,b)$ let $\{f_n(s)\} \to f(s)$. Then $\{f_n(x)\} \to f(x)$ uniformly on $[a,b]$.

Proof. Note that each $f_n(x) = \int_s^x f_n'(t)\,dt + f_n(s)$ by the Fundamental Theorem of Calculus, since each $f_n'$ is continuous and therefore integrable. Since $\{f_n'\} \to f'$ uniformly on $[a,b]$, it follows from Theorem 9.1 that $f'$ is continuous on $[a,b]$ as well so by the Fundamental Theorem of Calculus $f(x) = \int_s^x f'(t)\,dt + f(s)$. Thus, by Theorem 9.3 it follows that $\left\{\int_s^x f_n'(t)\,dt\right\} \to \int_s^x f'(t)\,dt$ uniformly on $[a,b]$, and therefore $\{f_n(x)\} = \left\{\int_s^x f_n'(t)\,dt + f_n(s)\right\} \to \int_s^x f'(t)\,dt + f(s) = f(x)$ uniformly by Exercise 9.9.




Theorem 9.5. Weierstrass M-test. Let $|f_n(x)| \le M_n$ on $E$ for each $n \in \mathbb{N}$ and let $\sum_{n=1}^{\infty} M_n$ converge. Then $\sum_{n=1}^{\infty} f_n$ converges absolutely and uniformly on $E$.

Proof. Absolute convergence follows directly from the Comparison Test. Let $\epsilon > 0$. Since $\sum_{n=1}^{\infty} M_n$ converges, by the Cauchy Convergence Criterion we can find $k \in \mathbb{N}$ so that if $n > m > k$ then $\sum_{i=m+1}^{n} M_i < \epsilon$, so $\left|\sum_{i=m+1}^{n} f_i(x)\right| \le \sum_{i=m+1}^{n} |f_i(x)| \le \sum_{i=m+1}^{n} M_i < \epsilon$ for all $x \in E$. Hence, $\sum_{n=1}^{\infty} f_n$ converges uniformly on $E$ by the Uniform Cauchy Criterion.


Definition 65 


A series of the form $\sum_{n=0}^{\infty} c_n (x - a)^n$ is called a power series centered at $a$. We say that a function $f$ is analytic on an interval $(s,t)$ if for every $x_0 \in (s,t)$ there is an $\epsilon > 0$ so that $f(x) = \sum_{n=0}^{\infty} c_n (x - x_0)^n$ for some coefficients $c_n$, for all $x \in (x_0 - \epsilon, x_0 + \epsilon)$.

If $f$ is infinitely differentiable on $(s,t)$ then we call $\sum_{n=0}^{\infty} \frac{f^{(n)}(x_0)(x - x_0)^n}{n!}$ the Taylor series for $f$ centered at $x_0$. If $x_0 = 0$ we call the series a Maclaurin series.

If a power series $\sum_{n=0}^{\infty} c_n (x - a)^n$ converges only if $x = a$ then we say the radius of convergence of the power series is zero. If there is a positive number $R$ so that the series converges on $(a - R, a + R)$ but diverges on $\mathbb{R} \setminus [a - R, a + R]$ then we say the radius of convergence of the series is $R$. If the series converges on $\mathbb{R}$ then we say that the radius of convergence is $\infty$. The interval of convergence of a power series is the set of values for which the power series converges.


We will show that the "interval of convergence" is an interval in theorems that follow.


Theorem 9.6. Let $\sum_{n=0}^{\infty} c_n (x - a)^n$ be a power series. Then:

(a) Exactly one of the following is true:

(1) $\sum_{n=0}^{\infty} c_n (x - a)^n$ converges only at the point $a$ and so the radius of convergence $R$ of the series is zero, which is true if and only if $\limsup |c_n|^{\frac{1}{n}} = \infty$.

(2) $\sum_{n=0}^{\infty} c_n (x - a)^n$ converges absolutely for all real $x$ and the radius of convergence $R$ of the series is infinity, which is true if and only if $\limsup |c_n|^{\frac{1}{n}} = 0$.

(3) There is a positive number $R$ so that $\sum_{n=0}^{\infty} c_n (x - a)^n$ converges absolutely on $(a - R, a + R)$ and diverges for all $x \in \mathbb{R} \setminus [a - R, a + R]$, meaning that the radius of convergence of the series is $R$, which is true if and only if $\limsup |c_n|^{\frac{1}{n}} = \frac{1}{R}$.

(b) If $r > 0$ then $\sum_{n=0}^{\infty} c_n (x - a)^n$ converges absolutely on $(a - r, a + r)$ if and only if $r \limsup |c_n|^{\frac{1}{n}} \le 1$.

(c) If $\sum_{n=0}^{\infty} c_n (x - a)^n$ converges on an interval $(a - R, a + R)$ and $[c,d] \subset (a - R, a + R)$ then $\sum_{n=0}^{\infty} c_n (x - a)^n$ converges uniformly on $[c,d]$.
n=0 


[oe) 

Proof. We use the Root Test. First, note that a Cn(a — a)” = co if x = a regardless of 
n=0 

choice of {cy}. 


(a) If {len|=} is unbounded (meaning lim sup lenl™ = oo) if and only if lenl |x — al is 
unbounded if z 4 a, so lim sup |en| |a —a| = oo if and only if limsup len| 7 = oo unless 
[o-e) 


x =a. In this case, ) Cn(a — a)” diverges, so the radius of convergence is zero. 
n=0 


If {cy} is bounded then lim sup |cp| niga non-negative number. If « 4 a then lim sup |cp| me 
(oe) 
0 if and only if lim sup |c,,| i |x —a| = 0 by Exercise 9.1, in which case S° Cn(a—a)” converges 


n=0 
absolutely for each x € R by the Root Test, so the radius of convergence of the series is 
infinity in this case. 


=I 


If lim sup len” is positive then let R = (limsup lenl*) . Then lim sup lenl = |x —al= 


lx — al lim sup |en|™ by Exercise 9.1. Since |x — al lim sup |en|™ <1 if |z-—a| < R and 
(oe) 
|x — a| lim sup lenl™ > 1 if |a—a| > R, we know that iG Cn(a — a)” converges absolutely if 


n=0 
|x —a| < R and diverges if |2 —a| > R by the Root Test. Hence, the radius of convergence 
of the series is R in this case. . 
(b) As noted above, if $r \limsup |c_n|^{1/n} \le 1$ for some positive number $r$ then if $|x-a| < r$ we know that $|x-a| \limsup |c_n|^{1/n} < 1$, so the series $\sum_{n=0}^{\infty} c_n(x-a)^n$ converges absolutely for all $x \in (a-r, a+r)$ by the Root Test. Conversely, if $\sum_{n=0}^{\infty} c_n(x-a)^n$ converges (absolutely) for all $x \in (a-r, a+r)$ then that means either the radius of convergence of the series is infinite and thus $\limsup |c_n|^{1/n} = 0$ by part (a) and hence $r \limsup |c_n|^{1/n} = 0 < 1$, or the radius of convergence is equal to some positive number $R \ge r$, in which case $R \limsup |c_n|^{1/n} = 1$ by part (a). Since $r \le R$ it follows that $r \limsup |c_n|^{1/n} \le 1$.

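(c) If $\sum_{n=0}^{\infty} c_n(x-a)^n$ converges on $(a-R, a+R)$ then the radius of convergence of the series is at least $R$, so $w = \limsup |c_n|^{1/n} \le \frac{1}{R}$. We can find a positive number $u < R$ so that $[c,d] \subset (a-u, a+u)$, and then $uw < 1$. Choose $v$ with $w < v < \frac{1}{u}$. Since $\limsup |c_n|^{1/n} = w < v$, there is a $k \in \mathbb{N}$ so that $|c_n|^{1/n} \le v$ for all $n \ge k$, and so $|c_n(x-a)^n| \le (uv)^n$ for all $x \in [c,d]$ and all $n \ge k$. Hence, if we set $M_n = (uv)^n$ then $\sum_{n=k}^{\infty} M_n$ is a geometric series which converges (because $uv < 1$), and so $\sum_{n=0}^{\infty} c_n(x-a)^n$ converges uniformly on $[c,d]$ by the Weierstrass $M$-test.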
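The Root Test description of the radius of convergence can be explored numerically. The following is only a sketch: the helper name `estimate_radius` and the tail range are our own choices, and the supremum over a late stretch of the sequence is used as a stand-in for the limit superior, so the output is an approximation rather than a proof.

```python
import math

def estimate_radius(log_abs_coeff, n_start=200, n_stop=400):
    """Approximate R = 1 / limsup |c_n|^(1/n) from a tail of the sequence.

    log_abs_coeff(n) must return ln|c_n|; working with logarithms avoids
    overflow for coefficients such as 2^n.  The supremum over the indices
    n_start..n_stop-1 is only a numerical stand-in for the limsup.
    """
    tail_sup = max(math.exp(log_abs_coeff(n) / n) for n in range(n_start, n_stop))
    return 1.0 / tail_sup

# c_n = 2^n: |c_n|^(1/n) = 2 for every n, so the radius is 1/2.
r_geometric = estimate_radius(lambda n: n * math.log(2.0))

# c_n = 1/n: |c_n|^(1/n) tends to 1, so the estimate should be close to 1.
r_reciprocal = estimate_radius(lambda n: -math.log(n))

print(r_geometric, r_reciprocal)
```

The second estimate approaches 1 slowly, which reflects how slowly $n^{1/n}$ converges to 1.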
Theorem 9.7. Let $f(x) = \sum_{n=0}^{\infty} c_n(x-a)^n$ be a power series and let $R > 0$. Then $\sum_{n=0}^{\infty} c_n(x-a)^n$ converges on the interval $(a-R, a+R)$ if and only if $\sum_{n=1}^{\infty} n c_n(x-a)^{n-1}$ converges on the interval $(a-R, a+R)$, which is true if and only if $\sum_{n=0}^{\infty} \frac{c_n}{n+1}(x-a)^{n+1}$ converges on the interval $(a-R, a+R)$.


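Proof. Note that $\limsup n^{1/n}|c_n|^{1/n} = \limsup |c_n|^{1/n} = \limsup \left(\frac{|c_n|}{n+1}\right)^{1/n}$ since $n^{1/n} \to 1$ and $(n+1)^{1/n} \to 1$ by Exercises 9.4 and 9.1. Since each of $\sum_{n=0}^{\infty} c_n(x-a)^n$, $\sum_{n=1}^{\infty} n c_n(x-a)^{n-1}$ and $\sum_{n=0}^{\infty} \frac{c_n}{n+1}(x-a)^{n+1}$ converges on the interval $(a-R, a+R)$ if and only if $R \limsup |c_n|^{1/n} \le 1$ (Theorem 9.6), the result follows.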


Theorem 9.8. Let $f(x) = \sum_{n=0}^{\infty} c_n(x-a)^n$ for all $x \in (a-R, a+R)$. Then $f'(x) = \sum_{n=1}^{\infty} n c_n(x-a)^{n-1}$ and the derivative of $\sum_{n=0}^{\infty} \frac{c_n}{n+1}(x-a)^{n+1}$ is $f(x)$ for all $x \in (a-R, a+R)$.


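Proof. For each $x \in (a-R, a+R)$ we can choose $[c,d]$ so that $a, x \in (c,d) \subset [c,d] \subset (a-R, a+R)$. By Theorem 9.6, the series for $f$ converges uniformly on $[c,d]$ and by Theorem 9.3,
$$\left\{\int_a^x \sum_{i=0}^{n} c_i(t-a)^i\,dt\right\} = \left\{\sum_{i=0}^{n} \frac{c_i}{i+1}(x-a)^{i+1}\right\} \to \int_a^x f(t)\,dt$$
uniformly. By the first form of the Fundamental Theorem of Calculus, $f(x) = \left(\int_a^x f(t)\,dt\right)' = \left(\sum_{n=0}^{\infty} \frac{c_n}{n+1}(x-a)^{n+1}\right)'$.

By Theorem 9.7 and Theorem 9.6, $g(x) = \sum_{n=1}^{\infty} n c_n(x-a)^{n-1}$ converges uniformly on $[c,d]$. Hence, as before, by Theorem 9.3 we know that $\int_a^x g(t)\,dt = \sum_{n=1}^{\infty} c_n(x-a)^n = f(x) - c_0$. Thus, $f'(x) = (f(x) - c_0)' = g(x)$.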
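As an illustration of term-by-term differentiation, the sketch below compares a partial sum of the differentiated geometric series $\sum_{n=1}^{\infty} n x^{n-1}$ with the derivative $\frac{1}{(1-x)^2}$ of $\frac{1}{1-x}$ at a point inside $(-1,1)$. The function name and the choice of 200 terms are our own assumptions for the example.

```python
def derivative_series_partial_sum(x, terms=200):
    """Partial sum of sum_{n=1}^{terms} n * x^(n-1), the termwise derivative
    of the geometric series sum_{n>=0} x^n."""
    return sum(n * x ** (n - 1) for n in range(1, terms + 1))

x = 0.3
approx = derivative_series_partial_sum(x)
exact = 1.0 / (1.0 - x) ** 2   # derivative of 1/(1 - x)

print(approx, exact)
```

Repeating the comparison for $x$ closer to $\pm 1$ requires more terms, in line with the fact that the convergence is uniform only on closed subintervals of $(-1,1)$.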




Theorem 9.9. Let $f(x) = \sum_{n=0}^{\infty} c_n(x-a)^n$ for all $x \in (a-R, a+R)$. Then $c_n = \frac{f^{(n)}(a)}{n!}$ for each $n \in \mathbb{N}$.


Proof. By Theorem 9.8, the function $f$ is infinitely differentiable on $(a-R, a+R)$, $f'(x) = \sum_{n=1}^{\infty} n c_n(x-a)^{n-1}$, $f''(x) = \sum_{n=2}^{\infty} n(n-1) c_n(x-a)^{n-2}$ and inductively, $f^{(k)}(x) = \sum_{n=k}^{\infty} n(n-1)(n-2)\cdots(n-k+1) c_n(x-a)^{n-k}$ for each $k \in \mathbb{N}$. Thus, $f^{(k)}(a) = k!\,c_k + 0 + 0 + \ldots$, so $c_k = \frac{f^{(k)}(a)}{k!}$.


Theorem 9.10. Let $f$ be an infinitely differentiable function on an interval $I = [a,b]$ so that, for some $M > 0$, $|f^{(n)}(x)| \le M^n$ for all $x \in I$ and each $n \in \mathbb{N}$. Then $f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)(x-a)^n}{n!}$, and the series converges uniformly on $I$.
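Proof. By Taylor's theorem, $f(x) = \sum_{i=0}^{n} \frac{f^{(i)}(a)(x-a)^i}{i!} + \frac{f^{(n+1)}(c)(x-a)^{n+1}}{(n+1)!}$ for some point $c \in [a,x]$, which means that
$$\left|f(x) - \sum_{i=0}^{n} \frac{f^{(i)}(a)(x-a)^i}{i!}\right| \le \frac{M^{n+1}(x-a)^{n+1}}{(n+1)!} \le \frac{M^{n+1}(b-a)^{n+1}}{(n+1)!},$$
which converges to zero and does not depend on $x$. Hence, $\sum_{n=0}^{\infty} \frac{f^{(n)}(a)(x-a)^n}{n!}$ converges uniformly to $f(x)$.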


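Theorem 9.11. The following are Maclaurin series converging to the functions listed on the intervals listed. In each case, we refer to the nth degree Taylor polynomial centered at zero as $T_n(x) = \sum_{i=0}^{n} \frac{f^{(i)}(0)x^i}{i!}$, and the Taylor Remainder after the nth power term as $R_n(x) = \frac{1}{n!}\int_0^x f^{(n+1)}(t)(x-t)^n\,dt$. We also use $T(x)$ to refer to $\sum_{n=0}^{\infty} \frac{f^{(n)}(0)x^n}{n!}$, the Maclaurin series for $f$.

(a) $\frac{1}{1-x} = \sum_{n=0}^{\infty} x^n = 1 + x + x^2 + x^3 + \ldots$ on $(-1,1)$.

(b) $\frac{1}{1+x} = \sum_{n=0}^{\infty} (-1)^n x^n = 1 - x + x^2 - x^3 + \ldots$ on $(-1,1)$.

(c) $\ln(1+x) = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}x^n}{n} = x - \frac{x^2}{2} + \frac{x^3}{3} - \ldots$ on $(-1,1]$.

(d) $\tan^{-1}(x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{2n+1} = x - \frac{x^3}{3} + \frac{x^5}{5} - \ldots$ on $[-1,1]$.

(e) $(1+x)^{\alpha} = \sum_{n=0}^{\infty} \frac{\alpha(\alpha-1)\cdots(\alpha-n+1)}{n!}x^n = 1 + \alpha x + \frac{\alpha(\alpha-1)}{2!}x^2 + \frac{\alpha(\alpha-1)(\alpha-2)}{3!}x^3 + \ldots$ on $(-1,1)$, where $\alpha \in \mathbb{R}$. We refer to $\frac{\alpha(\alpha-1)(\alpha-2)\cdots(\alpha-n+1)}{n!}$ as $\binom{\alpha}{n}$ so we can write $(1+x)^{\alpha} = \sum_{n=0}^{\infty} \binom{\alpha}{n} x^n$. This Maclaurin series is referred to as the Binomial Series.

(f) $e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \ldots$ on $(-\infty, \infty)$.

(g) $\sin(x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n+1)!} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \ldots$ on $(-\infty, \infty)$.

(h) $\cos(x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n)!} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \ldots$ on $(-\infty, \infty)$.

(i) $\sinh(x) = \sum_{n=0}^{\infty} \frac{x^{2n+1}}{(2n+1)!} = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \ldots$ on $(-\infty, \infty)$.

(j) $\cosh(x) = \sum_{n=0}^{\infty} \frac{x^{2n}}{(2n)!} = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \ldots$ on $(-\infty, \infty)$.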


Proof. We demonstrate a useful preliminary observation that lets us extend Taylor series convergent at the end points of an interval to corresponding function values:

If $T(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)(x-a)^n}{n!} = f(x)$ for a $C^{\infty}$ function $f$ on an interval $(a-\epsilon, a+\epsilon)$, then if $f$ is continuous at $a+\epsilon$ and $T$ converges uniformly on some interval $[c, a+\epsilon]$ then $T$ is continuous at $a+\epsilon$ by Theorem 9.1 and therefore $T(a+\epsilon) = f(a+\epsilon)$ because $f(a+\epsilon) = \lim_{x \to (a+\epsilon)^-} f(x) = \lim_{x \to (a+\epsilon)^-} T(x) = T(a+\epsilon)$. Similarly, if $T$ converges uniformly on some interval $[a-\epsilon, c]$ and $f$ is continuous at $a-\epsilon$ then $f(a-\epsilon) = T(a-\epsilon)$.

We sometimes integrate or differentiate power series term by term in the remainder of 
this proof. This is justified (on the interior of the interval of convergence) by Theorem 9.8, 
but we will not reference this theorem every time it is used. 

(a) This follows directly from Theorem 8.6. 


(b) Set $u = -x$. By part (a) we know $\frac{1}{1+x} = \frac{1}{1-u} = \sum_{n=0}^{\infty} u^n = \sum_{n=0}^{\infty} (-1)^n x^n$ when $|-x| = |x| < 1$.


(c) Since $\int \frac{1}{1+x}\,dx = \ln(1+x) + C$ it follows that, integrating the series for $\frac{1}{1+x}$, we have $\ln(1+x) = k + \sum_{n=1}^{\infty} \frac{(-1)^{n+1}x^n}{n}$ for some constant $k$. Setting $x = 0$ we see that $k = \ln(1) = 0$, so $\ln(1+x) = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}x^n}{n} = T(x)$ on $(-1,1)$.

By the Alternating Series Test, this series converges when $x = 1$ and $|T_n(x) - T(x)| \le \frac{1}{n}$ on $[0,1]$. From this we conclude $\{T_n(x)\}$ converges uniformly on $[0,1]$, so by the observation above we see that $T(1) = \ln(2)$ and $T(x) = \ln(1+x)$ on $(-1,1]$.


(d) Set $u = x^2$. By part (b) we know $\frac{1}{1+x^2} = \sum_{n=0}^{\infty} (-1)^n u^n = \sum_{n=0}^{\infty} (-1)^n x^{2n}$ if $|x^2| < 1$, which is true on $(-1,1)$. Integrating, we have $\int \frac{1}{1+x^2}\,dx = k + \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{2n+1}$. Setting $x = 0$ we see that $k = \tan^{-1}(0) = 0$ so $\tan^{-1}(x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{2n+1}$ on $(-1,1)$.

By the Alternating Series Test this series converges at both $-1$ and $1$. Furthermore, $|T_{2n}(x) - T(x)| = |T_{2n-1}(x) - T(x)| \le \frac{1}{2n+1}$ on $[-1,1]$ (where $T_n$ is the nth degree Taylor polynomial, a partial sum of the Taylor series) so the convergence of $\{T_n(x)\}$ is uniform on $[-1,1]$, so $T(x) = \tan^{-1}(x)$ for all $x \in [-1,1]$ by the observation above.

(e) We take derivatives to get $f'(x) = \alpha(1+x)^{\alpha-1}$, $f''(x) = \alpha(\alpha-1)(1+x)^{\alpha-2}$, and so on with $f^{(n)}(x) = \alpha(\alpha-1)\cdots(\alpha-n+1)(1+x)^{\alpha-n}$, which gives us that $T(x) = \sum_{n=0}^{\infty} \frac{\alpha(\alpha-1)(\alpha-2)\cdots(\alpha-n+1)}{n!}x^n$ and $T(x) = f(x)$ when $T(x)$ is convergent and $\{R_n(x)\} \to 0$. Since
$$\lim_{n \to \infty} \left|\frac{\alpha(\alpha-1)\cdots(\alpha-n)x^{n+1}}{(n+1)!} \cdot \frac{n!}{\alpha(\alpha-1)\cdots(\alpha-n+1)x^n}\right| = \lim_{n \to \infty} \left|\frac{(\alpha-n)}{(n+1)}x\right| = |x|,$$
the series $T(x)$ converges if $|x| < 1$ and diverges if $|x| > 1$ by the Ratio Test, but this does not show that the value $T(x)$ converges to is necessarily equal to $f(x)$.

Note that the Taylor remainder becomes zero for large values of $n$ (since $f^{(n)}$ is zero) regardless of the choice of $x$ if $\alpha$ is a positive integer.


Assume $\alpha$ is not a positive integer. By Taylor's Theorem (second form) we know that $|R_n(x)| \le \left(\max |f^{(n+1)}(t)| \text{ on } [0,x]\right) \frac{|x|^{n+1}}{(n+1)!}$ where $|f^{(n+1)}(t)| = |\alpha(\alpha-1)\cdots(\alpha-n)(1+t)^{\alpha-n-1}|$. For all integers $n > \alpha$, for $x > 0$ the maximum of $|f^{(n+1)}(t)|$ is $|\alpha(\alpha-1)\cdots(\alpha-n)|$. Thus, if $x > 0$ then we have $|R_n(x)| \le \frac{|\alpha(\alpha-1)\cdots(\alpha-n)||x|^{n+1}}{(n+1)!}$. We know that $\lim_{n \to \infty} \frac{x|\alpha-n|}{n+1} = x < 1$, so choose a $u \in (x,1)$ and an integer $k > \alpha$ so that if $i > k$ then $\frac{x|\alpha-i|}{i+1} < u$. Then for all $n > k$ we have $|R_n(x)| \le \frac{|\alpha(\alpha-1)\cdots(\alpha-k)||x|^{k+1}}{(k+1)!}u^{n-k}$, which approaches zero as $n$ approaches infinity.


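If $-1 < x < 0$ then we first observe that $\lim_{n \to \infty} n\binom{\alpha}{n}x^n = 0$. This is because
$$\lim_{n \to \infty} \left|\frac{(n+1)\alpha(\alpha-1)\cdots(\alpha-n)x^{n+1}\,n!}{(n+1)!\,x^n\,(n)\,\alpha(\alpha-1)\cdots(\alpha-n+1)}\right| = \lim_{n \to \infty} \left|\frac{x(\alpha-n)(n+1)}{n(n+1)}\right| = |x| < 1.$$
Thus, by Theorem 8.12, $\lim_{n \to \infty} n\binom{\alpha}{n}x^n = 0$.

Next, we observe that $R_n(x) = \frac{1}{n!}\int_0^x f^{(n+1)}(t)(x-t)^n\,dt = (n+1)\binom{\alpha}{n+1}\int_0^x (1+t)^{\alpha-n-1}(x-t)^n\,dt$. To see this, differentiate $(1+t)^{\alpha}$ $n+1$ times and notice that $\alpha(\alpha-1)\cdots(\alpha-n)$ can be written as $\frac{(n+1)!\,\alpha(\alpha-1)\cdots(\alpha-n)}{(n+1)!}$, which means that $\frac{\alpha(\alpha-1)\cdots(\alpha-n)}{n!} = (n+1)\binom{\alpha}{n+1}$.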




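Finally, we observe that $|R_n(x)| = (n+1)\left|\binom{\alpha}{n+1}\right| \int_x^0 \left|\frac{x-t}{1+t}\right|^n (1+t)^{\alpha-1}\,dt$. However, if $-1 < x \le t \le 0$ then $\left|\frac{t-x}{1+t}\right| \le |x|$, since if we differentiate this fraction with respect to $t$ we get $\frac{(1+t)-(t-x)}{(1+t)^2} = \frac{1+x}{(1+t)^2} > 0$, so the maximum value occurs at $t = 0$ and is $-x$. From this we conclude that $\left|\frac{t-x}{1+t}\right|^n \le |x|^n$ on $[x,0]$. Since $(1+t)^{\alpha-1}$ is continuous on $[x,0]$, it takes on a maximum value $M$ on that interval (which does not depend on $n$). Thus, $|R_n(x)| \le M(n+1)\left|\binom{\alpha}{n+1}\right| \int_x^0 |x|^n\,dt \le M(n+1)\left|\binom{\alpha}{n+1}\right| |x|^n$. By the observation above (applied with $n+1$ in place of $n$, and divided by the fixed non-zero number $|x|$), $\lim_{n \to \infty} (n+1)\left|\binom{\alpha}{n+1}\right| |x|^n = 0$, so $|R_n(x)| \to 0$.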
(f) We know that the nth derivative of $e^x$ is $e^x$ and $e^0 = 1$, so by Taylor's Theorem, it follows, for each $n \in \mathbb{N}$, that $e^x = \sum_{i=0}^{n} \frac{x^i}{i!} + \frac{1}{n!}\int_0^x e^t(x-t)^n\,dt$. Also, we know that since $e^x$ is an increasing function, if $x < 0$ then $\left|\frac{1}{n!}\int_0^x e^t(x-t)^n\,dt\right| \le \frac{|x|^{n+1}}{(n+1)!}$, and if $x > 0$ then $\frac{1}{n!}\int_0^x e^t(x-t)^n\,dt \le \frac{e^x x^{n+1}}{(n+1)!}$. Both of these approach zero since $\lim_{n \to \infty} \frac{x^{n+1}}{(n+1)!} = 0$. Thus, $e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \ldots$ on $(-\infty, \infty)$.

(g) The derivatives of $f(x) = \sin(x)$ follow the pattern $f(0) = \sin(0) = 0$, $f'(0) = \cos(0) = 1$, $f''(0) = -\sin(0) = 0$, $f'''(0) = -\cos(0) = -1$, $f^{(4)}(0) = \sin(0) = 0$ and so on. Thus, multiplying each nth derivative at zero by $\frac{x^n}{n!}$ and adding the results as normal, we end up with a Taylor series $T(x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n+1)!} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \ldots$. For this series, since the nth derivative of $\sin(x)$ has absolute value no larger than one, it follows that $|R_n(x)| \le \frac{|x|^{n+1}}{(n+1)!}$, which converges to zero regardless of what $x$ is, so $T(x) = \sin(x)$ for all real $x$.

(h) We differentiate each term of the Taylor series for sin(x) to get that the stated series 
is the series for cos(x) for all real numbers. 

(i) Since the series for $e^x$ and $e^{-x}$ converge everywhere we can obtain the series for $\frac{e^x - e^{-x}}{2}$ by subtracting the terms of these series and dividing by two, giving us $\sinh(x) = \sum_{n=0}^{\infty} \frac{x^{2n+1}}{(2n+1)!}$ on $(-\infty, \infty)$.

(j) Differentiating the preceding series term by term gives $\cosh(x) = \sum_{n=0}^{\infty} \frac{x^{2n}}{(2n)!} = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \ldots$ on $(-\infty, \infty)$.


Note that in part (e), the series actually converges to the function $(1+x)^{\alpha}$ at both end points if $\alpha > 0$ and at $x = 1$ if $-1 < \alpha < 0$.
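Two of the series in Theorem 9.11 can be checked numerically. The sketch below (function names, sample points and numbers of terms are our own choices, not part of the text) evaluates partial sums of the series for $\ln(1+x)$ at the end point $x = 1$, where the error is at most $\frac{1}{n}$, and partial sums of the Binomial Series at points of $(-1,1)$, building $\binom{\alpha}{k}$ from the recurrence $\binom{\alpha}{k+1} = \binom{\alpha}{k}\frac{\alpha-k}{k+1}$.

```python
import math

def log_partial_sum(x, n):
    """T_n(x) for ln(1 + x): sum of (-1)^(k+1) x^k / k for k = 1..n."""
    return sum((-1) ** (k + 1) * x ** k / k for k in range(1, n + 1))

def binomial_partial_sum(alpha, x, n):
    """Sum of binom(alpha, k) x^k for k = 0..n.

    The generalized binomial coefficient is updated with
    binom(alpha, k+1) = binom(alpha, k) * (alpha - k) / (k + 1).
    """
    total, coeff = 0.0, 1.0   # coeff starts as binom(alpha, 0) = 1
    for k in range(n + 1):
        total += coeff * x ** k
        coeff *= (alpha - k) / (k + 1)
    return total

# Part (c) at the end point x = 1: compare |T_n(1) - ln 2| with 1/n.
log_errors = {n: abs(log_partial_sum(1.0, n) - math.log(2.0)) for n in (10, 100, 1000)}

# Part (e) inside (-1, 1), once with x > 0 and once with x < 0.
binom_error_pos = abs(binomial_partial_sum(0.5, 0.5, 80) - 1.5 ** 0.5)
binom_error_neg = abs(binomial_partial_sum(-1.5, -0.5, 80) - 0.5 ** -1.5)

print(log_errors, binom_error_pos, binom_error_neg)
```

The slow decay of the errors at $x = 1$ contrasts with the geometric decay inside the interval, which is why the end point needed the separate uniform convergence argument.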




Exercises: 


Exercise 9.1. Let $\{x_n\}$ be bounded and let $\{y_n\} \to c > 0$. Then $\limsup\{y_n x_n\} = c\limsup\{x_n\}$.


Exercise 9.2. Let $\{x_n\} \to p$. Then $\limsup\{x_n\} = p$.


Exercise 9.3. Let $\{x_n\}$ have no upper bound and let $c > 0$. Then $\limsup\{cx_n\} = \infty$.

The next exercise actually uses the methods of sections five and six, but is useful here.


Exercise 9.4. Prove that $\lim_{n \to \infty} (n+j)^{1/n} = 1$ (if $n + j > 0$ for all $n \in \mathbb{N}$).


Exercise 9.5. Let $f(x) = e^{-1/x^2}$ if $x \ne 0$ and let $f(0) = 0$. Prove that $f$ has derivatives of all orders and $f^{(n)}(0) = 0$ for all non-negative integers $n$, but $f$ is not analytic on any open interval containing 0.


Exercise 9.6. Give an example of a sequence of functions $\{f_n\} \to f$ on a set $E$ so that each $f_n$ is bounded, but $f$ is not bounded.


Exercise 9.7. Prove that if a sequence of functions $\{f_n\} \to f$ uniformly on a set $E$ and each $f_n$ is bounded then $f$ is bounded.


Exercise 9.8. Give an example of a sequence of functions $\{f_n\} \to f$ on $[0,1]$ so that each $f_n$ is bounded and continuous except on a finite set, and $f$ is bounded, but is not continuous at any point.


Exercise 9.9. Let $\{f_n\} \to f$ uniformly on $[a,b]$ and let $\{g_n\} \to g$ uniformly on $[a,b]$. Then $\{f_n + g_n\} \to f + g$ uniformly on $[a,b]$. In particular, if $g_n = c_n$ and $\{c_n\} \to c$ then $\{f_n + c_n\} \to f + c$ uniformly on $[a,b]$.




Solutions: 


Solution to Exercise 9.1. Let $\{x_n\}$ be bounded and let $\{y_n\} \to c > 0$. Then $\limsup\{y_n x_n\} = c\limsup\{x_n\}$.


Proof. First, note that $\sup\{x_n, x_{n+1}, x_{n+2}, \ldots\}$ exists for each $n \in \mathbb{N}$ (and is always greater than or equal to $\inf\{x_1, x_2, x_3, \ldots\}$) since $\{x_n\}$ is bounded. Also, by an earlier exercise we know that if $n > m$ then $\sup\{x_n, x_{n+1}, x_{n+2}, \ldots\} \le \sup\{x_m, x_{m+1}, x_{m+2}, \ldots\}$, which means that $\{\sup\{x_n, x_{n+1}, x_{n+2}, \ldots\}\}$ is a non-increasing sequence which is bounded below and converges. Also, we know that since $\{x_n\}$ is bounded and $\{y_n\}$ converges, both sequences are bounded, so we can find $M, N > 0$ so that $|x_n| < M$ and $|y_n| < N$ for all $n \in \mathbb{N}$, which means that $|x_n y_n| < MN$ for all $n \in \mathbb{N}$, so $\{x_n y_n\}$ is bounded and $\limsup\{y_n x_n\}$ exists.

Set $\limsup\{x_n\} = L$, and choose $B > 0$ so that $|x_n| < B$ for all $n \in \mathbb{N}$. Let $\delta > 0$ and then choose $j \in \mathbb{N}$ so that if $n > j$ then $|y_n - c| < \delta$. Then $cx_n - \delta|x_n| \le x_n y_n \le cx_n + \delta|x_n|$. Thus, $\sup\{x_n y_n, x_{n+1}y_{n+1}, x_{n+2}y_{n+2}, \ldots\} \le \sup\{cx_n + \delta B, cx_{n+1} + \delta B, \ldots\} \le c\sup\{x_n, x_{n+1}, x_{n+2}, \ldots\} + \delta B$ by earlier theorems. Thus, $\lim_{n \to \infty} \sup\{x_n y_n, x_{n+1}y_{n+1}, \ldots\} \le cL + \delta B$ for every $\delta > 0$, and hence this limit is at most $cL$. Similarly, $\lim_{n \to \infty} \sup\{x_n y_n, x_{n+1}y_{n+1}, \ldots\} \ge cL$, so the result follows.


Solution to Exercise 9.2. Let $\{x_n\} \to p$. Then $\limsup\{x_n\} = p$.


Proof. Using Exercise 9.1, this result follows immediately, setting the bounded sequence to be $\{1\}$ in the theorem statement.
A direct proof is also straightforward however:

Let $\epsilon > 0$. Choose $k \in \mathbb{N}$ so that if $n > k$ then $|x_n - p| < \frac{\epsilon}{2}$, so $p - \epsilon < \sup\{x_n, x_{n+1}, x_{n+2}, \ldots\} \le p + \frac{\epsilon}{2} < p + \epsilon$, so $|\sup\{x_n, x_{n+1}, x_{n+2}, \ldots\} - p| < \epsilon$. Thus, $\{\sup\{x_n, x_{n+1}, x_{n+2}, \ldots\}\} \to p$.


Solution to Exercise 9.3. Let $\{x_n\}$ have no upper bound and let $c > 0$. Then $\limsup\{cx_n\} = \infty$.


Proof. Let $M > 0$. There is an $n$ so that $x_n > \frac{M}{c}$, which means that $cx_n > M$. Thus, $\{cx_n\}$ is not bounded above, so $\limsup\{cx_n\} = \infty$.


Solution to Exercise 9.4. Prove that $\lim_{n \to \infty} (n+j)^{1/n} = 1$ if $n + j > 0$.


Proof. Taking the log we have $\lim_{n \to \infty} \ln\left((n+j)^{1/n}\right) = \lim_{n \to \infty} \frac{1}{n}\ln(n+j)$. Using L'Hospital's Rule, this is $\lim_{n \to \infty} \frac{1}{n+j} = 0$. Since $e^x$ is continuous and the inverse of $\ln(x)$ we have that $\lim_{n \to \infty} \exp\left(\ln\left((n+j)^{1/n}\right)\right) = \lim_{n \to \infty} (n+j)^{1/n} = e^0 = 1$.
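A quick numerical look at the limit in Exercise 9.4 is sketched below. The function name and the sample values of $j$ and $n$ are our own choices; the computation illustrates the limit but does not replace the proof.

```python
def nth_root_sequence(j, n):
    """The n-th term (n + j)^(1/n) of the sequence in Exercise 9.4."""
    return (n + j) ** (1.0 / n)

# For j = 5 the terms decrease toward 1 as n grows.
values = [nth_root_sequence(5, n) for n in (10, 1000, 100000)]
print(values)
```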




Solution to Exercise 9.5. Let $f(x) = e^{-1/x^2}$ if $x \ne 0$ and let $f(0) = 0$. Prove that $f$ has derivatives of all orders, but is not analytic on any open interval containing 0.


Proof. Each derivative $f^{(n)}(x)$ is a rational function multiplied by $e^{-1/x^2}$ everywhere except at $x = 0$. To see this, note that this is true for $n = 0$ and if the kth derivative is $\frac{p(x)}{q(x)}e^{-1/x^2}$ then
$$f^{(k+1)}(x) = \frac{q(x)p'(x) - p(x)q'(x)}{(q(x))^2}e^{-1/x^2} + \frac{2p(x)}{x^3 q(x)}e^{-1/x^2},$$
which is a rational function times $e^{-1/x^2}$, so the statement follows inductively.

At $x = 0$, let $f^{(k)}(x) = \frac{p(x)}{q(x)}e^{-1/x^2}$ for $x \ne 0$ and assume $f^{(k)}(0) = 0$. Then we have $f^{(k+1)}(0) = \lim_{x \to 0} \frac{f^{(k)}(x) - 0}{x - 0} = \lim_{x \to 0} \frac{p(x)}{xq(x)}e^{-1/x^2}$. For any positive integer $m$ it is true that $\lim_{x \to 0^+} \frac{1}{x^m}e^{-1/x^2} = \lim_{u \to \infty} u^m e^{-u^2}$ by Theorem 4.14. By L'Hospital's rule we know that $\lim_{u \to \infty} \frac{u^m}{e^u} = 0$ since differentiating the numerator $m$ times leaves a derivative of $m!$ and differentiating the denominator still leaves $e^u$ and $\lim_{u \to \infty} \frac{m!}{e^u} = 0$. Since $u^m e^{-u^2} \le \frac{u^m}{e^u}$ for all $u \ge 1$ it follows that $\lim_{u \to \infty} u^m e^{-u^2} = 0$ by the Squeeze Theorem for extended real numbers. Similarly, it follows that $\lim_{x \to 0^-} \frac{1}{x^m}e^{-1/x^2} = 0$.

If we choose $m$ larger than the degree of $xq(x)$ then $\lim_{x \to 0} \frac{x^m}{xq(x)} = 0$ since if $k$ is the power of the lowest power summand $Cx^k$ in $xq(x)$ then dividing the numerator and denominator by $x^k$ gives us a numerator $x^{m-k}$ which approaches zero, and a denominator which approaches $C$ as $x$ approaches zero. Since $0 \le \left|\frac{1}{xq(x)}\right|e^{-1/x^2} \le \frac{1}{|x|^m}e^{-1/x^2}$ for $x$ sufficiently close to zero, it follows from the Squeeze Theorem that $\lim_{x \to 0} \frac{1}{xq(x)}e^{-1/x^2} = 0$. We also know that $\lim_{x \to 0} p(x) = p(0)$. Thus, by the product rule for limits $f^{(k+1)}(0) = \lim_{x \to 0} \frac{p(x)}{xq(x)}e^{-1/x^2} = p(0)(0) = 0$.

Thus, all derivatives of all orders for $f(x)$ are zero at $x = 0$, so the Maclaurin series for $f(x)$ is valid only at a single point.


Solution to Exercise 9.6. Give an example of a sequence of functions $\{f_n\} \to f$ on a set $E$ so that each $f_n$ is bounded, but $f$ is not bounded.


Proof. Let $f_n(x) = \frac{1}{x + \frac{1}{n}}$ on $(0,1]$. Each function is bounded between 0 and $n$, but the limit of the sequence of functions is $f(x) = \frac{1}{x}$, which is not bounded.


Solution to Exercise 9.7. Prove that if a sequence of functions $\{f_n\} \to f$ uniformly on a set $E$ and each $f_n$ is bounded then $f$ is bounded.


Proof. Choose $k \in \mathbb{N}$ so that if $n \ge k$ then $|f(x) - f_n(x)| < 1$ for all $x \in E$. Choose $M > 0$ so that $|f_k(x)| < M$ for all $x \in E$. Then $|f(x)| \le |f(x) - f_k(x)| + |f_k(x)| < M + 1$ for all $x \in E$, so $f$ is bounded.


Solution to Exercise 9.8. Give an example of a sequence of functions $\{f_n\} \to f$ on $[0,1]$ so that each $f_n$ is bounded and continuous except on a finite set, and $f$ is bounded, but is not continuous at any point.


Proof. Order the set of rational numbers in $[0,1]$ as $Q = \{q_1, q_2, q_3, \ldots\}$. Then define $f_n(x) = 1$ if $x \in \{q_1, q_2, q_3, \ldots, q_n\}$ and $f_n(x) = 0$ otherwise. Then $\{f_n\} \to f$ where $f(x) = 1$ if $x \in Q$ and $f(x) = 0$ otherwise, which is a function which is bounded and everywhere discontinuous.


Solution to Exercise 9.9. Let $\{f_n\} \to f$ uniformly on $[a,b]$ and let $\{g_n\} \to g$ uniformly on $[a,b]$. Then $\{f_n + g_n\} \to f + g$ uniformly on $[a,b]$. In particular, if $g_n = c_n$ and $\{c_n\} \to c$ then $\{f_n + c_n\} \to f + c$ uniformly on $[a,b]$.


Proof. Let $\epsilon > 0$. Choose $k_1, k_2 \in \mathbb{N}$ so that if $n > k_1$ then $|f_n(x) - f(x)| < \frac{\epsilon}{2}$ for all $x \in [a,b]$ and if $n > k_2$ then $|g_n(x) - g(x)| < \frac{\epsilon}{2}$ for all $x \in [a,b]$. Then if $n > \max\{k_1, k_2\}$ it follows that $|(f_n(x) + g_n(x)) - (f(x) + g(x))| \le |f_n(x) - f(x)| + |g_n(x) - g(x)| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$, so $\{f_n + g_n\} \to f + g$ uniformly on $[a,b]$.

Finally, let $g_n(x) = c_n$ on $[a,b]$ for each $n \in \mathbb{N}$ and let $g(x) = c$ on $[a,b]$, where $\{c_n\} \to c$. Then for any $\epsilon > 0$ we can find $k \in \mathbb{N}$ so that if $n > k$ then $|c_n - c| = |g_n(x) - g(x)| < \epsilon$ (for all $x \in [a,b]$), so from the previous paragraph we see that $\{f_n(x) + c_n\} \to f(x) + c$ uniformly.


Chapter 10 


Structure of Euclidean Space 


We assume that the reader is familiar with linear algebra at this point. For readers who are not, we have placed a section on matrices in the Supplementary Materials for Multiple Variables that introduces the relatively small amount of linear algebra needed for this text.


Definition 66 


The space $\mathbb{R}^n$ consists of all points which are $n$-tuples $(x_1, x_2, \ldots, x_n)$ in the $n$-fold Cartesian product $\mathbb{R} \times \mathbb{R} \times \ldots \times \mathbb{R}$. A vector in $\mathbb{R}^n$ is a directed line segment. We use the notation $\mathbf{x}$ to denote the vector $\langle x_1, x_2, \ldots, x_n \rangle$, which is the directed line segment which, if placed with its base (beginning point) at the origin, has its end at the point $(x_1, x_2, x_3, \ldots, x_n)$. We consider all translations of $\mathbf{x}$ to be equivalent vectors to $\mathbf{x}$, all of which are written as $\mathbf{x}$. If there is no context to indicate which translation (determined by which starting point) for a vector is to be used, the vector symbol normally indicates the vector based at the origin. We interchangeably use the notation $\mathbf{x}$ to refer to the point $(x_1, x_2, x_3, \ldots, x_n) \in \mathbb{R}^n$, the vector $\langle x_1, x_2, \ldots, x_n \rangle$ or the row or column matrix whose entries are the entries of the vector $\langle x_1, x_2, \ldots, x_n \rangle$, depending on context. Furthermore, we may also use the notation $(x_1, x_2, \ldots, x_n)$ to refer to the vector rather than the point and $\langle x_1, x_2, \ldots, x_n \rangle$ to refer to the point rather than the vector. We add vectors and multiply them by scalars entry by entry: for vectors $\mathbf{x} = \langle x_1, x_2, \ldots, x_n \rangle$ and $\mathbf{y} = \langle y_1, y_2, \ldots, y_n \rangle$ and real numbers (also referred to as scalars) $\alpha, \beta$, we define $\alpha\mathbf{x} + \beta\mathbf{y} = \langle \alpha x_1 + \beta y_1, \alpha x_2 + \beta y_2, \ldots, \alpha x_n + \beta y_n \rangle$. More formally, the $n$-tuple vector $\mathbf{x}$ can refer to any line segment of the form $L(\mathbf{a}, \mathbf{a}+\mathbf{x}) = \{\mathbf{a} + t\mathbf{x} \in \mathbb{R}^n \mid t \in [0,1]\}$. The base of the vector is $\mathbf{a}$, obtained by plugging in $t = 0$, and the terminal point or end of the vector is $\mathbf{a} + \mathbf{x}$, obtained by plugging in $t = 1$. If $\mathbf{x} = L(\mathbf{a}, \mathbf{a}+\mathbf{x}) = L(\mathbf{b}, \mathbf{b}+\mathbf{x})$ then we refer to these line segments as translations of one another. The fact that $\mathbf{x} = \langle x_1, x_2, x_3, \ldots, x_n \rangle$ means that when $\mathbf{a} = \mathbf{0} = \langle 0, 0, 0, \ldots, 0 \rangle$ in this definition, the terminal point of the vector is $\langle x_1, x_2, \ldots, x_n \rangle$. Some books refer to the class of all such segments as the vector and a specific directed line segment as a particular realization of the vector. We will refer to different directed line segments as the same vector if they are translations of each other. We refer to the line through $\mathbf{a}$ and $\mathbf{a} + \mathbf{x}$ as being $(\mathbf{a}, \mathbf{x}) = \{\mathbf{a} + t\mathbf{x} \in \mathbb{R}^n \mid t \in \mathbb{R}\}$.




Notation: While there is sometimes value in distinguishing between a vector list of coordinates and a list of coordinates considering an $n$-tuple as a point, we will normally not do so. So, $(x_1, x_2, x_3, \ldots, x_n)$ could refer to the vector or point (or column or row matrix) with the listed entries, as could $\langle x_1, x_2, \ldots, x_n \rangle$, but in this text we will usually mean the vector when we write $\langle x_1, x_2, \ldots, x_n \rangle$. In general, if a vector is listed with a bold letter label (such as $\mathbf{p}$) in an argument then it is understood that the non-bolded letter indexed with a number $i$ (in this case $p_i$) refers to the ith entry of $\mathbf{p}$ unless otherwise stated.


Definition 67 
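We define the norm, magnitude or length of vector $\mathbf{x} = \langle x_1, x_2, \ldots, x_n \rangle$ to be $|\mathbf{x}| = \sqrt{\sum_{i=1}^{n} x_i^2}$. We define the distance from $\mathbf{x}$ to $\mathbf{y}$ to be $|\mathbf{x} - \mathbf{y}|$.

We define the dot product of vectors $\mathbf{x} = \langle x_1, x_2, \ldots, x_n \rangle$ and $\mathbf{y} = \langle y_1, y_2, \ldots, y_n \rangle$ to be $\mathbf{x} \cdot \mathbf{y} = \sum_{k=1}^{n} x_k y_k$. Note that $\mathbf{x} \cdot \mathbf{x} = |\mathbf{x}|^2$. We define the angle between two non-zero vectors $\mathbf{x}$ and $\mathbf{y}$ to be $\cos^{-1}\left(\frac{\mathbf{x} \cdot \mathbf{y}}{|\mathbf{x}||\mathbf{y}|}\right)$.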
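The quantities in this definition translate directly into a few lines of code. The following sketch represents vectors as Python tuples; the function names `dot`, `norm`, `distance` and `angle` are our own, and the clamp inside `angle` is an added safeguard against floating-point rounding pushing the cosine slightly outside $[-1,1]$.

```python
import math

def dot(x, y):
    """Dot product: sum of x_k * y_k."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Norm |x| = sqrt(x . x)."""
    return math.sqrt(dot(x, x))

def distance(x, y):
    """Distance |x - y| between two points or vectors."""
    return norm([a - b for a, b in zip(x, y)])

def angle(x, y):
    """Angle arccos(x . y / (|x||y|)) between two non-zero vectors."""
    cosine = dot(x, y) / (norm(x) * norm(y))
    return math.acos(max(-1.0, min(1.0, cosine)))  # clamp against rounding

print(norm((3.0, 4.0)), distance((1.0, 1.0), (4.0, 5.0)), angle((1.0, 0.0), (0.0, 1.0)))
```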


While we have defined the angle and distance as stated, there are geometric motivations 
for these choices, which we briefly discuss. These rely on properties of geometry that would 
take us a bit on a tangent, so these are more descriptions than proofs based on what we 
have shown. It does not do any harm to simply use these as definitions in this development, 
though. 


The distance between two points in three dimensions can be found using the Pythagorean theorem twice, drawing a box with the points $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ at opposite corners, and for simplicity we will assume that $x_1 > x_2$. If we look at the front face of the box wherein we have a rectangle with fixed $x$ coordinate then the diagonal of that rectangle connects $(x_1, y_1, z_1)$ and $(x_1, y_2, z_2)$ and has length $l = \sqrt{(y_2 - y_1)^2 + (z_2 - z_1)^2}$. Then the diagonal $l$ is perpendicular to the edge of the box at its end point connecting $(x_1, y_2, z_2)$ and $(x_2, y_2, z_2)$. Thus, using the Pythagorean theorem again we get that the distance from $(x_1, y_1, z_1)$ to $(x_2, y_2, z_2)$ is the length of the hypotenuse of the triangle with vertices $(x_1, y_1, z_1)$, $(x_1, y_2, z_2)$ and $(x_2, y_2, z_2)$, which is $d = \sqrt{(x_2 - x_1)^2 + l^2}$. Substituting for $l$ we motivate the following definition for distance, which can be generalized.




Distance in Three Dimensions 


From this distance formula, it is possible to derive a formula for the equation for a sphere. The set of points $(x,y,z)$ on the sphere with center $(x_1, y_1, z_1)$ and radius $r$ would be the set of points satisfying the equation $\sqrt{(x - x_1)^2 + (y - y_1)^2 + (z - z_1)^2} = r$. Squaring both sides we get the following equation for a sphere: $(x - x_1)^2 + (y - y_1)^2 + (z - z_1)^2 = r^2$.


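Theorem 10.1. Let $\mathbf{u}, \mathbf{v}, \mathbf{w} \in \mathbb{R}^n$, and let $a \in \mathbb{R}$. Then:
(a) $\mathbf{u} \cdot \mathbf{v} = \mathbf{v} \cdot \mathbf{u}$
(b) $\mathbf{u} \cdot (\mathbf{v} + \mathbf{w}) = \mathbf{u} \cdot \mathbf{v} + \mathbf{u} \cdot \mathbf{w}$
(c) $(a\mathbf{u}) \cdot \mathbf{v} = a(\mathbf{u} \cdot \mathbf{v})$


Proof. Let $\mathbf{u} = \langle u_1, u_2, u_3, \ldots, u_n \rangle$, $\mathbf{v} = \langle v_1, v_2, v_3, \ldots, v_n \rangle$, and $\mathbf{w} = \langle w_1, w_2, w_3, \ldots, w_n \rangle$.

(a) $\mathbf{u} \cdot \mathbf{v} = u_1 v_1 + u_2 v_2 + u_3 v_3 + \ldots + u_n v_n = v_1 u_1 + v_2 u_2 + v_3 u_3 + \ldots + v_n u_n = \mathbf{v} \cdot \mathbf{u}$.

(b) $\mathbf{u} \cdot (\mathbf{v} + \mathbf{w}) = u_1(v_1 + w_1) + u_2(v_2 + w_2) + u_3(v_3 + w_3) + \ldots + u_n(v_n + w_n) = u_1 v_1 + u_2 v_2 + u_3 v_3 + \ldots + u_n v_n + u_1 w_1 + u_2 w_2 + u_3 w_3 + \ldots + u_n w_n = \mathbf{u} \cdot \mathbf{v} + \mathbf{u} \cdot \mathbf{w}$.

(c) $(a\mathbf{u}) \cdot \mathbf{v} = au_1 v_1 + au_2 v_2 + au_3 v_3 + \ldots + au_n v_n = a(u_1 v_1 + u_2 v_2 + u_3 v_3 + \ldots + u_n v_n) = a(\mathbf{u} \cdot \mathbf{v})$.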


Without the geometric development to justify things like what an angle means this 
argument isn’t rigorous, so we will define angle in terms of dot product instead for purposes 
of this text. 




Definition 68 


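Let $\mathbf{u}, \mathbf{v}$ be non-zero vectors in $\mathbb{R}^n$. We define the angle between $\mathbf{u}$ and $\mathbf{v}$ to be $\theta = \cos^{-1}\left(\frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}||\mathbf{v}|}\right)$.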


It would be nice to see that the angle we have defined matches what would be understood 
to be the angle from geometry classes. The following theorem addresses that idea. It might 
be a mistake to call it a theorem because we have not done the geometric development 
that is assumed (essentially, we are showing that our definition of angle is the same as one 
that has not been properly characterized with geometric axioms but which is intuitively 
understood by most readers). 


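Theorem 10.2. Let $\mathbf{u}, \mathbf{v}$ be non-zero vectors (based at the origin). Then the smallest angle between $\mathbf{u}$ and $\mathbf{v}$ is $\cos^{-1}\left(\frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}||\mathbf{v}|}\right)$.


Proof. The vectors $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{u} - \mathbf{v}$ form a triangle (with the last vector positioned to start at the end of $\mathbf{v}$ and end at the end of $\mathbf{u}$). Using the Law of Cosines, we know that if $\theta$ is the angle between $\mathbf{u}$ and $\mathbf{v}$ then $\cos(\theta) = \frac{|\mathbf{u}|^2 + |\mathbf{v}|^2 - |\mathbf{u} - \mathbf{v}|^2}{2|\mathbf{u}||\mathbf{v}|}$. Thus,
$$\cos(\theta) = \frac{\mathbf{u} \cdot \mathbf{u} + \mathbf{v} \cdot \mathbf{v} - (\mathbf{u} - \mathbf{v}) \cdot (\mathbf{u} - \mathbf{v})}{2|\mathbf{u}||\mathbf{v}|} = \frac{\mathbf{u} \cdot \mathbf{u} + \mathbf{v} \cdot \mathbf{v} - \mathbf{u} \cdot \mathbf{u} - \mathbf{v} \cdot \mathbf{v} + 2\mathbf{u} \cdot \mathbf{v}}{2|\mathbf{u}||\mathbf{v}|}$$
by Theorem 10.1, so $\cos(\theta) = \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}||\mathbf{v}|}$, as desired.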


Definition 69 


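Let $\mathbf{u}, \mathbf{v}$ be vectors in $\mathbb{R}^n$ with $\mathbf{v}$ non-zero. We define the scalar projection $\mathrm{comp}_{\mathbf{v}}\mathbf{u}$ of vector $\mathbf{u}$ onto the direction of vector $\mathbf{v}$ to be $|\mathbf{u}|\cos(\theta) = \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{v}|}$ where $\theta$ is the smallest angle between the two vectors. Likewise, the vector in the direction of vector $\mathbf{v}$ having this (signed) length is called the vector projection of vector $\mathbf{u}$ in the direction of vector $\mathbf{v}$ and is $\mathrm{proj}_{\mathbf{v}}\mathbf{u} = \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{v}|^2}\mathbf{v}$.

We say that non-zero vectors $\mathbf{u}$ and $\mathbf{v}$ are perpendicular or orthogonal if $\mathbf{u} \cdot \mathbf{v} = 0$. The zero vector is orthogonal to all vectors but is not perpendicular to any vector. Two vectors $\mathbf{a}$ and $\mathbf{b}$ are parallel if $\mathbf{a} = k\mathbf{b}$ for some non-zero scalar $k$.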
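As a small computational sketch of projections, the code below assumes the standard convention $\mathrm{comp}_{\mathbf{v}}\mathbf{u} = \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{v}|}$ and $\mathrm{proj}_{\mathbf{v}}\mathbf{u} = \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{v}|^2}\mathbf{v}$; the function names are our own. It also checks that what is left of $\mathbf{u}$ after removing its projection is orthogonal to $\mathbf{v}$.

```python
import math

def dot(x, y):
    """Dot product: sum of x_k * y_k."""
    return sum(a * b for a, b in zip(x, y))

def scalar_projection(u, v):
    """comp_v u = (u . v) / |v|, assuming v is non-zero."""
    return dot(u, v) / math.sqrt(dot(v, v))

def vector_projection(u, v):
    """proj_v u = ((u . v) / |v|^2) v, assuming v is non-zero."""
    scale = dot(u, v) / dot(v, v)
    return tuple(scale * vi for vi in v)

u, v = (2.0, 3.0), (4.0, 0.0)
comp = scalar_projection(u, v)
proj = vector_projection(u, v)
residual = tuple(ui - pi for ui, pi in zip(u, proj))  # u minus its projection

print(comp, proj, dot(residual, v))
```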


Note that two vectors are perpendicular if and only if the angle between the vectors is $\cos^{-1}(0) = \frac{\pi}{2}$. The plane perpendicular to a vector $\langle a, b, c \rangle$ containing the point $(x_0, y_0, z_0)$ is the set of points $(x, y, z)$ so that $\langle x - x_0, y - y_0, z - z_0 \rangle$ is perpendicular to $\langle a, b, c \rangle$.


Theorem 10.3. The plane $P$ containing the point $(x_0, y_0, z_0)$ which is perpendicular to the vector $\langle a, b, c \rangle$ has equation $a(x - x_0) + b(y - y_0) + c(z - z_0) = 0$.


Proof. A point $(x, y, z) \in P$ if and only if $\langle x - x_0, y - y_0, z - z_0 \rangle$ is perpendicular to $\langle a, b, c \rangle$, which is true if and only if $\langle a, b, c \rangle \cdot \langle x - x_0, y - y_0, z - z_0 \rangle = 0$, which is true if and only if $a(x - x_0) + b(y - y_0) + c(z - z_0) = 0$.
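The plane equation gives an immediate membership test: a point lies in the plane exactly when the left-hand side evaluates to zero. The sketch below uses a hypothetical helper `plane_residual` and sample integer data of our own choosing.

```python
def plane_residual(normal, point_on_plane, p):
    """Evaluate a(x - x0) + b(y - y0) + c(z - z0) for normal = (a, b, c),
    point_on_plane = (x0, y0, z0) and p = (x, y, z)."""
    return sum(n * (pi - qi) for n, pi, qi in zip(normal, p, point_on_plane))

normal = (1, 2, 3)
base = (1, 1, 1)

on_plane = plane_residual(normal, base, (2, 2, 0))    # 1*1 + 2*1 + 3*(-1)
off_plane = plane_residual(normal, base, (0, 0, 0))   # 1*(-1) + 2*(-1) + 3*(-1)

print(on_plane, off_plane)
```

A zero residual means the point satisfies the equation of the plane; a non-zero residual means it does not.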


Definition 70 


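Let $\mathbf{u}, \mathbf{v}$ be vectors in $\mathbb{R}^n$. A parallelogram, two of whose sides are vectors $\mathbf{u}$ and $\mathbf{v}$, both based at one vertex $\mathbf{p}$ of the parallelogram, is the set of all points $\{\mathbf{p} + a\mathbf{u} + b\mathbf{v} \mid a, b \in [0,1]\}$. A parallelepiped whose edges are vectors $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{w}$, all based at the same vertex $\mathbf{p}$, is the set of all points $\{\mathbf{p} + a\mathbf{u} + b\mathbf{v} + c\mathbf{w} \mid a, b, c \in [0,1]\}$. A line passing through point $\mathbf{p}$ in direction $\mathbf{u}$ (the line passing through $\mathbf{p}$ and $\mathbf{p} + \mathbf{u}$) is $(\mathbf{p}, \mathbf{p}+\mathbf{u}) = \{\mathbf{p} + t\mathbf{u} \mid t \in \mathbb{R}\}$. The angle between lines in directions $\mathbf{u}$ and $\mathbf{v}$ is the smallest of the angle between $\mathbf{u}$ and $\mathbf{v}$ and the angle between $\mathbf{u}$ and $-\mathbf{v}$. If the angle between two lines is zero then the lines are parallel. If two lines are not parallel in $\mathbb{R}^3$ and they also do not intersect then they are said to be skew. We use the notation $\det(\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n)$ to denote the determinant of the matrix whose ith row is the vector $\mathbf{u}_i \in \mathbb{R}^n$.
If $\mathbf{u} = \langle u_1, u_2, u_3 \rangle, \mathbf{v} = \langle v_1, v_2, v_3 \rangle \in \mathbb{R}^3$ then the cross product is
$$\mathbf{u} \times \mathbf{v} = \det\begin{pmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{pmatrix} = \langle u_2 v_3 - u_3 v_2,\ u_3 v_1 - u_1 v_3,\ u_1 v_2 - u_2 v_1 \rangle.$$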


Some of what we do in multivariable calculus will be focused specifically on $\mathbb{R}^3$ and surfaces contained in $\mathbb{R}^3$, and the cross product is helpful for many theorems in Euclidean three space. While part of the next theorem relies on geometry, we will be able to prove those results below that are dependent on geometric principles more rigorously with theorems we will prove when we cover integration.


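Theorem 10.4. Let $\mathbf{u} = \langle u_1, u_2, u_3 \rangle$ and $\mathbf{v} = \langle v_1, v_2, v_3 \rangle$ and $\mathbf{w} = \langle w_1, w_2, w_3 \rangle$. Then:
(a) $\mathbf{u} \times \mathbf{v}$ is the unique vector so that for every vector $\mathbf{w} \in \mathbb{R}^3$, it is true that $\begin{vmatrix} w_1 & w_2 & w_3 \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix} = \mathbf{w} \cdot (\mathbf{u} \times \mathbf{v})$.
(b) $\mathbf{u} \times \mathbf{v} = -\mathbf{v} \times \mathbf{u}$
(c) $\mathbf{u}$ and $\mathbf{v}$ are perpendicular to $\mathbf{u} \times \mathbf{v}$
(d) $|\mathbf{u} \times \mathbf{v}| = |\mathbf{u}||\mathbf{v}|\sin(\theta)$ where $\theta$ is the angle between $\mathbf{u}$ and $\mathbf{v}$.
(e) The area of a parallelogram, two of whose sides are vectors $\mathbf{u}$ and $\mathbf{v}$ in $\mathbb{R}^3$ both based at one vertex of the parallelogram, is $|\mathbf{u} \times \mathbf{v}|$. If $\mathbf{u}, \mathbf{v}$ are vectors in $\mathbb{R}^2$ then $|\det(\mathbf{u}, \mathbf{v})|$ is the area of the parallelogram with those vectors as edges.
(f) The volume of a parallelepiped, three of whose edges are vectors $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{w}$, all based at the same vertex of the parallelepiped, is $|\mathbf{w} \cdot (\mathbf{u} \times \mathbf{v})| = |\det(\mathbf{w}, \mathbf{u}, \mathbf{v})|$.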
(g) $\mathbf{v} \cdot (\mathbf{w} \times \mathbf{u}) = \mathbf{w} \cdot (\mathbf{u} \times \mathbf{v})$.




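Proof. (a) Simply using the definition,
$$\begin{vmatrix} w_1 & w_2 & w_3 \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix} = w_1\begin{vmatrix} u_2 & u_3 \\ v_2 & v_3 \end{vmatrix} - w_2\begin{vmatrix} u_1 & u_3 \\ v_1 & v_3 \end{vmatrix} + w_3\begin{vmatrix} u_1 & u_2 \\ v_1 & v_2 \end{vmatrix}$$
$= w_1(u_2 v_3 - u_3 v_2) + w_2(u_3 v_1 - u_1 v_3) + w_3(u_1 v_2 - u_2 v_1) = \mathbf{w} \cdot (\mathbf{u} \times \mathbf{v})$. The fact that this is the only vector having this property follows from taking the dot products with the component vectors $\mathbf{i}, \mathbf{j}, \mathbf{k}$, since these dot products give us that the first, second and third components of a vector $\mathbf{z}$ having the property that for every vector $\mathbf{w} = \langle w_1, w_2, w_3 \rangle \in \mathbb{R}^3$ the determinant above equals $\mathbf{w} \cdot \mathbf{z}$ are exactly $u_2 v_3 - u_3 v_2$, $u_3 v_1 - u_1 v_3$ and $u_1 v_2 - u_2 v_1$, respectively.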

(b) Since determinants are negated by switching two rows of a matrix, this follows from part (a). Another argument would be simply applying the definition, since $\mathbf{v} \times \mathbf{u} = \langle v_2 u_3 - v_3 u_2, v_3 u_1 - v_1 u_3, v_1 u_2 - v_2 u_1 \rangle = -\langle u_2 v_3 - u_3 v_2, u_3 v_1 - u_1 v_3, u_1 v_2 - u_2 v_1 \rangle = -\mathbf{u} \times \mathbf{v}$.

(c) This follows from the fact that $\mathbf{u} \cdot (\mathbf{u} \times \mathbf{v})$ is the determinant of a matrix whose first two rows are $\mathbf{u}$, which is zero. Alternately, we plug into the usual formula to get that $\mathbf{u} \cdot (\mathbf{u} \times \mathbf{v}) = u_1(u_2 v_3 - u_3 v_2) + u_2(u_3 v_1 - u_1 v_3) + u_3(u_1 v_2 - u_2 v_1) = 0$ and $\mathbf{v} \cdot (\mathbf{u} \times \mathbf{v}) = v_1(u_2 v_3 - u_3 v_2) + v_2(u_3 v_1 - u_1 v_3) + v_3(u_1 v_2 - u_2 v_1) = 0$ directly through cancellation.

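(d) By definition, we have $|\mathbf{u} \times \mathbf{v}|^2 = (u_2 v_3 - u_3 v_2)^2 + (u_3 v_1 - u_1 v_3)^2 + (u_1 v_2 - u_2 v_1)^2 = u_2^2 v_3^2 + u_3^2 v_2^2 + u_3^2 v_1^2 + u_1^2 v_3^2 + u_1^2 v_2^2 + u_2^2 v_1^2 - 2(u_2 u_3 v_2 v_3 + u_1 u_3 v_1 v_3 + u_1 u_2 v_1 v_2)$. Also, $|\mathbf{u}|^2|\mathbf{v}|^2 = (u_1^2 + u_2^2 + u_3^2)(v_1^2 + v_2^2 + v_3^2) = u_1^2 v_1^2 + u_1^2 v_2^2 + u_1^2 v_3^2 + u_2^2 v_1^2 + u_2^2 v_2^2 + u_2^2 v_3^2 + u_3^2 v_1^2 + u_3^2 v_2^2 + u_3^2 v_3^2$, and $(\mathbf{u} \cdot \mathbf{v})^2 = (u_1 v_1 + u_2 v_2 + u_3 v_3)^2 = u_1^2 v_1^2 + u_2^2 v_2^2 + u_3^2 v_3^2 + 2(u_2 u_3 v_2 v_3 + u_1 u_3 v_1 v_3 + u_1 u_2 v_1 v_2)$. Thus, $|\mathbf{u} \times \mathbf{v}|^2 = |\mathbf{u}|^2|\mathbf{v}|^2 - (\mathbf{u} \cdot \mathbf{v})^2 = |\mathbf{u}|^2|\mathbf{v}|^2 - |\mathbf{u}|^2|\mathbf{v}|^2\cos^2(\theta) = |\mathbf{u}|^2|\mathbf{v}|^2(1 - \cos^2(\theta)) = |\mathbf{u}|^2|\mathbf{v}|^2\sin^2(\theta)$. Taking the square root of both sides (and noting that $\sin(\theta) \ge 0$ since $0 \le \theta \le \pi$) gives $|\mathbf{u} \times \mathbf{v}| = |\mathbf{u}||\mathbf{v}|\sin(\theta)$.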


(e) The area of a parallelogram is the product of the length of adjacent sides of the parallelogram times the sine of the angle between them. In this case, that is $|\mathbf{u}||\mathbf{v}|\sin(\theta) = |\mathbf{u} \times \mathbf{v}|$ by part (d). In the case of two coordinate vectors $\langle u_1, u_2 \rangle$ and $\langle v_1, v_2 \rangle$ the area of the parallelogram with those vectors as sides is the area of the parallelogram with sides $\langle u_1, u_2, 0 \rangle$ and $\langle v_1, v_2, 0 \rangle$, the cross product of which is $\langle 0, 0, u_1 v_2 - u_2 v_1 \rangle$, the norm of which is $|u_1 v_2 - u_2 v_1| = |\det(\mathbf{u}, \mathbf{v})|$.

(f) Intuitively, the volume of a parallelepiped is just its height times the area of its base since every cross section parallel to the base parallelogram is a congruent parallelogram with the same area. In other words, if we view two of the vectors as vectors within the base of the parallelepiped (say $\mathbf{u}$ and $\mathbf{v}$), the volume of the parallelepiped is the length of the projection of the third vector $\mathbf{w}$ onto the direction perpendicular to this base ($\mathbf{u} \times \mathbf{v}$) times the area of the base parallelogram. Since the area of the base is $|\mathbf{u} \times \mathbf{v}|$, we use the projection formula to get that the volume is
$$\left|\mathbf{w} \cdot \frac{\mathbf{u} \times \mathbf{v}}{|\mathbf{u} \times \mathbf{v}|}\right||\mathbf{u} \times \mathbf{v}| = |\mathbf{w} \cdot (\mathbf{u} \times \mathbf{v})| = \left|\begin{vmatrix} w_1 & w_2 & w_3 \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix}\right| = |\det(\mathbf{w}, \mathbf{u}, \mathbf{v})|.$$
This result is also proven later (as a consequence of Theorem 12.61, for example) without relying on the notions of Cavalieri's principle addressed above.

(g) By part (a) we know that v-w x u = det(v, w, u) and that w- (ux v) = det(w, u, v). 
Since switching rows negates the determinant, two row switches performed consecutively 
leaves the determinant unchanged, from which we see that v- w x u= w- (ux v). 

Since some students may not have reviewed linear algebra recently, we can do additional algebra to verify that
$$\det(v,w,u) = v_1(w_2u_3 - w_3u_2) + v_2(w_3u_1 - w_1u_3) + v_3(w_1u_2 - w_2u_1) = u_1(v_2w_3 - v_3w_2) + u_2(v_3w_1 - v_1w_3) + u_3(v_1w_2 - w_1v_2) = \det(u,v,w).$$
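
As a quick numerical check of parts (d) through (g), the following worked example uses three vectors chosen here purely for illustration (they are an assumption of this sketch and do not appear in the text).

```latex
% Illustrative vectors (hypothetical choice, not from the text)
\[ u = \langle 1,0,0 \rangle, \qquad v = \langle 1,2,0 \rangle, \qquad w = \langle 1,1,3 \rangle \]
% Parts (d), (e): area of the base parallelogram
\[ u \times v = \langle 0,0,2 \rangle, \qquad |u \times v| = 2 = |u||v|\sin(\theta)
   \quad \text{with } |u| = 1,\ |v| = \sqrt{5},\ \sin(\theta) = \tfrac{2}{\sqrt{5}} \]
% Part (f): volume of the parallelepiped
\[ |w \cdot (u \times v)| = |1 \cdot 0 + 1 \cdot 0 + 3 \cdot 2| = 6 \]
% Part (g): the cyclic identity
\[ w \times u = \langle 0,3,-1 \rangle, \qquad v \cdot (w \times u) = 0 + 6 + 0 = 6 = w \cdot (u \times v) \]
```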


Theorem 10.5. Cauchy-Schwarz Inequality. Let $x,y \in \mathbb{R}^n$. Then $|x \cdot y| \le |x||y|$.


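Proof. If $y = 0$ then both sides are zero, so assume $y \ne 0$. We have
$$0 \le \left| x - \frac{x \cdot y}{|y|^2}\, y \right|^2 = \left( x - \frac{x \cdot y}{|y|^2}\, y \right) \cdot \left( x - \frac{x \cdot y}{|y|^2}\, y \right) = |x|^2 - 2\frac{(x \cdot y)^2}{|y|^2} + \frac{(x \cdot y)^2}{|y|^2} = |x|^2 - \frac{(x \cdot y)^2}{|y|^2}.$$
Thus, $|x \cdot y|^2 \le |x|^2|y|^2$ and $|x \cdot y| \le |x||y|$.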
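
To see the inequality and the non-negative quantity from the proof in a concrete case, here is a small worked example. The vectors are chosen here only for illustration.

```latex
% Illustrative vectors (hypothetical choice, not from the text)
\[ x = (1,2,2), \qquad y = (3,0,4) \]
\[ x \cdot y = 3 + 0 + 8 = 11, \qquad |x| = 3, \qquad |y| = 5, \qquad |x \cdot y| = 11 \le 15 = |x||y| \]
% The quantity shown to be non-negative in the proof
\[ |x|^2 - \frac{(x \cdot y)^2}{|y|^2} = 9 - \frac{121}{25} = \frac{104}{25} \ge 0 \]
```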


Theorem 10.6. Triangle Inequality. Let $x,y \in \mathbb{R}^n$. Then:
(a) $|x+y| \le |x| + |y|$.
(b) $|x-y| \ge |x| - |y|$.
(c) If $x_i \in \mathbb{R}^n$ for all $1 \le i \le m$ then $\left|\sum_{i=1}^m x_i\right| \le \sum_{i=1}^m |x_i|$.


Proof. (a) We know that $0 \le |x+y|^2 = (x+y) \cdot (x+y) = |x|^2 + |y|^2 + 2x \cdot y \le |x|^2 + |y|^2 + 2|x||y| = (|x|+|y|)^2$ by the Cauchy-Schwarz Inequality. Thus, $|x+y| \le |x|+|y|$.
(b) By (a) we know that $|x| = |y + (x-y)| \le |y| + |x-y|$, so $|x-y| \ge |x| - |y|$.
(c) We proceed by induction. This is true if $m = 1$. Assume it is true for $m = k \in \mathbb{N}$. Then $\left|\sum_{i=1}^{k+1} x_i\right| = \left|\sum_{i=1}^{k} x_i + x_{k+1}\right| \le \left|\sum_{i=1}^{k} x_i\right| + |x_{k+1}|$ by (a), which is less than or equal to $\sum_{i=1}^{k} |x_i| + |x_{k+1}|$ by the induction hypothesis, so $\left|\sum_{i=1}^{k+1} x_i\right| \le \sum_{i=1}^{k+1} |x_i|$. The result follows for all natural numbers $m$.


Some theorems are more helpful to prove in the general setting of metric spaces so that 
they can be used for other purposes, so we also define metric spaces before we start proving 
theorems about calculus in higher dimensions. 


Definition 71 


A set of points $X$ with a distance function $\rho : X \times X \to [0,\infty)$ (called a metric for $X$) is a metric space provided that:


(1) $\rho(x,y) = 0$ if and only if $x = y$
(2) Symmetry: $\rho(x,y) = \rho(y,x)$ for all $x,y \in X$
(3) Triangle inequality: $\rho(x,z) \le \rho(x,y) + \rho(y,z)$ for all $x,y,z \in X$.


Theorem 10.7. $\mathbb{R}^n$ is a metric space under the metric $\rho(x,y) = |x-y|$.




Proof. The distance function $\rho(x,y) = |x-y|$ satisfies all of the properties of a metric. Properties (1) and (2) are immediate from the definition since the difference of two entries squared is the same no matter which term is subtracted from which, and is only zero if the entries are equal. The triangle inequality follows from Theorem 10.6(a), so the result follows.
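
For readers who want the triangle inequality step spelled out, the following short derivation is one way to obtain property (3) from part (a) of the Triangle Inequality for norms. It is a sketch of the standard argument rather than a quotation of the text.

```latex
% Property (3) for rho(x,y) = |x - y|, using |a + b| <= |a| + |b|
\begin{align*}
\rho(x,z) &= |x - z| \\
          &= |(x - y) + (y - z)| \\
          &\le |x - y| + |y - z| \\
          &= \rho(x,y) + \rho(y,z).
\end{align*}
```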


Definition 72 


In a metric space $(X,d)$ we define an open ball about $p$ of radius $\epsilon > 0$ to be $B_\epsilon(p) = \{x \in X \mid d(x,p) < \epsilon\}$. We also define the closed ball about $p$ of radius $\epsilon > 0$ to be $\overline{B}_\epsilon(p) = \{x \in X \mid d(x,p) \le \epsilon\}$. A set $A$ is open in $X$ (or just open) if for every point $p \in A$ there is an open ball containing $p$ which is contained in $A$. The complement of an open set is closed. We say that $q$ is a limit point of a set $A$ if every open ball about $q$ contains a point of $A$ distinct from $q$. A point of $A$ that is not a limit point of $A$ is an isolated point of $A$. A set $U$ is open in a set $E$ if $U = V \cap E$ for some open set $V$, and $A$ is closed in $E$ if $A = C \cap E$ for some closed set $C$. A set $S$ is bounded if there is some $M > 0$ so that $S \subseteq B_M(p)$ for some $p \in X$. The diameter of a set $S \subseteq X$ is $\sup_{x,y \in S} d(x,y)$ if $S$ is bounded, and infinity otherwise.
Let $(X,d)$, $(Y,\rho)$ be metric spaces, $D \subseteq X$. A sequence of points in $X$ is a function $g : \mathbb{N} \to X$, where we usually denote the image of the $n$th integer as a letter subscripted with $n$, so we might write $g(n) = x_n$ for instance. We typically use $\{x_n\}$ or $\{x_n\}_{n \in \mathbb{N}}$ as the notation for such a function $g$ rather than the symbol $g$ itself. We say that $\{x_n\} \to p$ in $X$ if for every $\epsilon > 0$ there is some $k \in \mathbb{N}$ so that if $n > k$ then $d(x_n,p) < \epsilon$. We say that $f : D \to Y$ is continuous at $c \in D$ if for every $\epsilon > 0$ there is a $\delta > 0$ so that if $x \in D$ and $d(x,c) < \delta$ then $\rho(f(x),f(c)) < \epsilon$. Let $p$ be a limit point of $D$. We say that $\lim_{x \to p} f(x) = L$ if for every $\epsilon > 0$ there is a $\delta > 0$ so that if $0 < d(x,p) < \delta$ and $x \in D$ then $\rho(f(x),L) < \epsilon$.


Note that since an open set and its complement are open and closed respectively, their intersections with $E$ are open in $E$ and closed in $E$ respectively, meaning that the complement (in $E$) of a set which is open in $E$ is closed in $E$ and the complement in $E$ of a closed set in $E$ is open in $E$.


Theorem 10.8. Let $(X,d)$ be a metric space and let $p \in X$ and let $\epsilon > 0$. Then $B_\epsilon(p)$ is open and $\overline{B}_\epsilon(p)$ is closed.


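Proof. Let $q \in B_\epsilon(p)$. Set $\delta = \epsilon - d(p,q)$. Let $z \in B_\delta(q)$. Then $d(z,p) \le d(z,q) + d(q,p) < \delta + d(q,p) = \epsilon$ by the triangle inequality, so $z \in B_\epsilon(p)$, which means that $B_\delta(q) \subseteq B_\epsilon(p)$, so $B_\epsilon(p)$ is open.

Let $q \in X \setminus \overline{B}_\epsilon(p)$. Then $d(p,q) > \epsilon$. Set $\delta = d(p,q) - \epsilon$. Let $z \in B_\delta(q)$. Then $d(p,z) \ge d(p,q) - d(z,q) > d(p,q) - \delta = \epsilon$ by the triangle inequality, so $B_\delta(q) \subseteq X \setminus \overline{B}_\epsilon(p)$, which means that $X \setminus \overline{B}_\epsilon(p)$ is open so $\overline{B}_\epsilon(p)$ is closed.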
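
The choice of $\delta$ in both halves of the proof can be checked on the real line with the usual metric $d(x,y) = |x-y|$. The specific numbers below are illustrative assumptions, not values from the text.

```latex
% Open ball: p = 0, epsilon = 1, so B_1(0) = (-1,1); take q = 0.7
\[ \delta = \epsilon - d(p,q) = 1 - 0.7 = 0.3, \qquad B_{0.3}(0.7) = (0.4,\,1.0) \subseteq (-1,1) \]
% Closed ball: closed ball of radius 1 about 0 is [-1,1]; take q = 1.5 outside it
\[ \delta = d(p,q) - \epsilon = 1.5 - 1 = 0.5, \qquad B_{0.5}(1.5) = (1,\,2) \subseteq \mathbb{R} \setminus [-1,1] \]
```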




Theorem 10.9. Let $\{x_i\}$ be a sequence in $\mathbb{R}^m$, where each $x_i = (x_{i_1}, x_{i_2}, \dots, x_{i_m})$. Then $\{x_i\} \to p = (p_1,p_2,\dots,p_m)$ if and only if $\{x_{i_j}\} \to p_j$ for each $1 \le j \le m$.


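Proof. First, assume that $\{x_i\} \to p$. Then let $\epsilon > 0$. Choose $k$ so that if $i > k$ then $|x_i - p| < \epsilon$. Since $|x_i - p| \ge |x_{i_j} - p_j|$ we know that $|x_{i_j} - p_j| < \epsilon$, so $\{x_{i_j}\} \to p_j$.

Conversely, assume $\{x_{i_j}\} \to p_j$ for each integer $j \in \{1,2,\dots,m\}$, and let $\epsilon > 0$. Then we can choose $k \in \mathbb{N}$ so that if $i > k$ then $|x_{i_j} - p_j| < \frac{\epsilon}{\sqrt{m}}$ for each $1 \le j \le m$. Thus, if $i > k$ then
$$|x_i - p| = \sqrt{\sum_{j=1}^m (x_{i_j} - p_j)^2} < \sqrt{\sum_{j=1}^m \frac{\epsilon^2}{m}} = \epsilon.$$
Hence, $\{x_i\} \to p$.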
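
A small example may help fix the idea of componentwise convergence. The sequence below in $\mathbb{R}^2$ is chosen here for illustration only.

```latex
% Illustrative sequence in R^2 (hypothetical choice, not from the text)
\[ x_i = \left( \frac{1}{i},\ 1 + \frac{1}{i^2} \right), \qquad p = (0,1) \]
% Each component converges
\[ \frac{1}{i} \to 0, \qquad 1 + \frac{1}{i^2} \to 1 \]
% The norm of the difference is controlled by the components
\[ |x_i - p| = \sqrt{\frac{1}{i^2} + \frac{1}{i^4}} \le \sqrt{\frac{2}{i^2}} = \frac{\sqrt{2}}{i} \to 0 \]
```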


Theorem 10.10. Let $(X,d)$ be a metric space with $c \in X$, and let $f : D \to Y$ be a function, where $(Y,\rho)$ is a metric space and $D \subseteq X$. Then $\lim_{x \to c} f(x) = L$ if and only if $\lim_{x \to c} \rho(f(x),L) = 0$.


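Proof. Let $\epsilon > 0$. First, assume that $\lim_{x \to c} \rho(f(x),L) = 0$. Since the function $\rho(f(x),L) : D \to \mathbb{R}$ has a limit at $c$, it follows that $c$ is a limit point of $D$. Choose $\delta > 0$ so that if $0 < d(x,c) < \delta$ and $x \in D$ then $|\rho(f(x),L) - 0| < \epsilon$. Then it follows that $\rho(f(x),L) < \epsilon$, so $\lim_{x \to c} f(x) = L$.

Next, assume that $\lim_{x \to c} f(x) = L$. Then $c$ is a limit point of $D$. Choose $\delta > 0$ so that if $0 < d(x,c) < \delta$ and $x \in D$ then $\rho(f(x),L) < \epsilon$. Then $|\rho(f(x),L) - 0| < \epsilon$, so $\lim_{x \to c} \rho(f(x),L) = 0$.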


Theorem 10.11. Squeeze Theorem (for metric spaces)

Let $f,g : D \to Y$ where $(X,d)$ and $(Y,\rho)$ are metric spaces and $D \subseteq X$.

(a) If $\lim_{x \to c} \rho(g(x),L) = 0$ and $\rho(f(x),L) \le \rho(g(x),L)$ for all $x \in D \cap B_\delta(c)$ for some $\delta > 0$ then $\lim_{x \to c} f(x) = L$.

(b) If $\{t_n\} \to 0$ in $\mathbb{R}$ and $d(x_n,p) \le t_n$ for all $n > k$ for some $k \in \mathbb{N}$ then $\{x_n\} \to p$ in $X$.


Proof. (a) We know $c$ is a limit point of $D$ since $\lim_{x \to c} \rho(g(x),L) = 0$. Let $\epsilon > 0$. We can choose $0 < \gamma < \delta$ so that if $0 < d(x,c) < \gamma$ and $x \in D$ then $\rho(g(x),L) < \epsilon$, which implies that $\rho(f(x),L) < \epsilon$, so $\lim_{x \to c} f(x) = L$.

(b) Let $\epsilon > 0$. Choose $j \in \mathbb{N}$ so that $j > k$ and if $n > j$ then $|t_n - 0| < \epsilon$. Then if $n > j$ we know that $d(x_n,p) \le t_n < \epsilon$, so $\{x_n\} \to p$.


Theorem 10.12. Let A be a subset of a metric space X. Then p is a limit point of A if 
and only if there is a sequence of points in A \ {p} converging to p. 


Proof. First, assume there is such a sequence $\{x_n\}$. Then for every $\epsilon > 0$, for sufficiently large $n$ we know that $x_n \in B_\epsilon(p)$ and $x_n \ne p$, which means that $p$ is a limit point of $A$.

Next, assume that $p$ is a limit point of $A$. Then for each $n \in \mathbb{N}$ we can choose a point $p_n \in B_{1/n}(p) \cap A \setminus \{p\}$. Then $\{p_n\} \to p$ by the Squeeze Theorem.




Theorem 10.13. The Sequential Characterization of Limits for functions on metric spaces.

Let $f : D \to Y$, a metric space with metric $\rho$, where $D \subseteq X$, a metric space with metric $d$, and $c$ is a limit point of $D$. Then $\lim_{x \to c} f(x) = L$ if and only if for every sequence $\{x_n\} \subseteq D \setminus \{c\}$, if $\{x_n\} \to c$ then $\{f(x_n)\} \to L$.


Proof. First, assume that $\lim_{x \to c} f(x) = L$. Let $\{x_n\} \subseteq D \setminus \{c\}$ such that $\{x_n\} \to c$. Let $\epsilon > 0$. Then for some $\delta > 0$, we know that if $0 < d(x,c) < \delta$ and $x \in D$ then $\rho(f(x),L) < \epsilon$. Since $\{x_n\} \to c$, we can find $k \in \mathbb{N}$ so that if $n > k$ then $d(x_n,c) < \delta$, and since $\{x_n\} \subseteq D \setminus \{c\}$ it follows that if $n > k$ then $0 < d(x_n,c) < \delta$. Hence, if $n > k$ then $\rho(f(x_n),L) < \epsilon$, so $\{f(x_n)\} \to L$.

Next, assume that for every sequence $\{x_n\} \subseteq D \setminus \{c\}$, if $\{x_n\} \to c$ then $\{f(x_n)\} \to L$. Suppose that it is false that $\lim_{x \to c} f(x) = L$. Then we can find an $\epsilon > 0$ so that for every $\delta > 0$ there is some $x \in D \setminus \{c\}$ so that $d(x,c) < \delta$ but $\rho(f(x),L) \ge \epsilon$. For each $n \in \mathbb{N}$ we choose $x_n \in D \setminus \{c\}$ so that $d(x_n,c) < \frac{1}{n}$ and $\rho(f(x_n),L) \ge \epsilon$. Then $\{x_n\} \to c$ by the Squeeze Theorem, but $\{f(x_n)\} \not\to L$, contradicting our assumption.


Theorem 10.14. The Sequential Characterization of Continuity for functions on metric spaces. Let $f : D \to Y$, a metric space with metric $\rho$, where $D \subseteq X$, a metric space with metric $d$, and $c \in D$. Then $f$ is continuous at $c$ if and only if for every sequence $\{x_n\} \subseteq D$, if $\{x_n\} \to c$ then $\{f(x_n)\} \to f(c)$.


Proof. First, assume that $f$ is continuous at $c$. Let $\{x_n\} \subseteq D$ such that $\{x_n\} \to c$. Let $\epsilon > 0$. Then for some $\delta > 0$, we know that if $d(x,c) < \delta$ and $x \in D$ then $\rho(f(x),f(c)) < \epsilon$. Since $\{x_n\} \to c$, we can find $k \in \mathbb{N}$ so that if $n > k$ then $d(x_n,c) < \delta$. Hence, if $n > k$ then $\rho(f(x_n),f(c)) < \epsilon$, so $\{f(x_n)\} \to f(c)$.

Next, assume that for every sequence $\{x_n\} \subseteq D$, if $\{x_n\} \to c$ then $\{f(x_n)\} \to f(c)$. Suppose that $f$ is not continuous at $c$. Then we can find an $\epsilon > 0$ so that for every $\delta > 0$ there is some $x \in D$ so that $d(x,c) < \delta$ but $\rho(f(x),f(c)) \ge \epsilon$. For each $n \in \mathbb{N}$ we choose $x_n \in D$ so that $d(x_n,c) < \frac{1}{n}$ and $\rho(f(x_n),f(c)) \ge \epsilon$. Then $\{x_n\} \to c$ by the Squeeze Theorem, but $\{f(x_n)\} \not\to f(c)$, contradicting our assumption.
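
The sequential characterization is often used to show that a function fails to be continuous: it suffices to exhibit one sequence along which the values do not converge to $f(c)$. The function below is a standard illustration chosen here as an assumption; it does not appear in the text.

```latex
% Illustrative function on R^2 (hypothetical choice, not from the text)
\[ f(x,y) = \begin{cases} \dfrac{xy}{x^2 + y^2} & (x,y) \ne (0,0) \\[2mm] 0 & (x,y) = (0,0) \end{cases} \]
% Along the x-axis the values converge to f(0,0)
\[ \left(\tfrac{1}{n},\,0\right) \to (0,0), \qquad f\left(\tfrac{1}{n},\,0\right) = 0 \to 0 = f(0,0) \]
% Along the diagonal they do not
\[ \left(\tfrac{1}{n},\,\tfrac{1}{n}\right) \to (0,0), \qquad
   f\left(\tfrac{1}{n},\,\tfrac{1}{n}\right) = \frac{1/n^2}{2/n^2} = \frac{1}{2} \not\to 0 \]
% So f is not continuous at (0,0).
```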


Theorem 10.15. Let $f : D \to Y$, where $D \subseteq X$, a metric space with metric $d$, $Y$ is a metric space with distance function $\rho$ and $c \in D$.
(a) Let $c$ be a limit point of $D$. Then $f$ is continuous at $c$ if and only if $\lim_{x \to c} f(x) = f(c)$.
(b) If $c$ is an isolated point of $D$ then $f$ is continuous at $c$.


Proof. (a) First, assume that $f$ is continuous at $c$. Let $\epsilon > 0$. We know that for some $\delta > 0$, if $d(x,c) < \delta$ and $x \in D$ then $\rho(f(x),f(c)) < \epsilon$. Hence, if $0 < d(x,c) < \delta$ and $x \in D$ then $\rho(f(x),f(c)) < \epsilon$, so $\lim_{x \to c} f(x) = f(c)$.

Next, assume that $\lim_{x \to c} f(x) = f(c)$. Let $\epsilon > 0$. We know that for some $\delta > 0$, if $0 < d(x,c) < \delta$ and $x \in D$ then $\rho(f(x),f(c)) < \epsilon$, but if $x = c$ then $\rho(f(x),f(c)) = 0 < \epsilon$ as well. Hence, if $d(x,c) < \delta$ and $x \in D$ then $\rho(f(x),f(c)) < \epsilon$, so $f$ is continuous at $c$.

(b) Since $c$ is an isolated point of $D$, we can find $\delta > 0$ so that the only point of $D$ whose distance from $c$ is less than $\delta$ is $c$. Hence, if $d(x,c) < \delta$ and $x \in D$ then $x = c$, so $\rho(f(x),f(c)) = 0$, which is less than any positive number $\epsilon$, and therefore $f$ is continuous at $c$.


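Theorem 10.16. Let $f : X \to Y$ and $g : Y \to Z$ be functions where $X,Y,Z$ are metric spaces and $p \in \mathrm{dom}(g \circ f)$. Let $f$ be continuous at $p$ and $g$ be continuous at $f(p)$. Then $g \circ f$ is continuous at $p$.


Proof. Let $\{x_n\} \to p$, where $\{x_n\} \subseteq \mathrm{dom}(g \circ f)$. Then $\{f(x_n)\} \to f(p)$ and so $\{g(f(x_n))\} \to g(f(p))$ by the Sequential Characterization of Continuity, which implies that $g \circ f$ is continuous at $p$.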


Theorem 10.17. Let $f : D \to \mathbb{R}^m$ where $D \subseteq \mathbb{R}^n$, and $f(x) = (f_1(x), f_2(x), \dots, f_m(x))$ where each $f_j : D \to \mathbb{R}$ and $c$ is a limit point of $D$. Then
(a) $\lim_{x \to c} f(x) = L = (L_1, L_2, \dots, L_m)$ if and only if $\lim_{x \to c} f_j(x) = L_j$ for each $1 \le j \le m$.
(b) $f$ is continuous at $p \in D$ if and only if each component function $f_j$ is continuous at $p$.


Proof. (a) By the Sequential Characterization of Limits we know that $\lim_{x \to c} f(x) = L$ if and only if for every sequence $\{x_i\} \subseteq D \setminus \{c\}$ so that $\{x_i\} \to c$ it is true that $\{f(x_i)\} \to L$. Choose a sequence $\{x_i\} \subseteq D \setminus \{c\}$ so that $\{x_i\} \to c$.

By Theorem 10.9, $\{f(x_i)\} \to L$ if and only if $\{f_j(x_i)\} \to L_j$ for all $1 \le j \le m$.

By the Sequential Characterization of Limits, for all $1 \le j \le m$, $\{f_j(x_i)\} \to L_j$ for each sequence $\{x_i\} \subseteq D \setminus \{c\}$ so that $\{x_i\} \to c$ if and only if $\lim_{x \to c} f_j(x) = L_j$. Hence, the result follows.

(b) By the Sequential Characterization of Continuity we know that $f$ is continuous at $p$ if and only if for every sequence $\{x_i\} \subseteq D$ so that $\{x_i\} \to p$ it is true that $\{f(x_i)\} \to f(p)$.

By Theorem 10.9, $\{f(x_i)\} \to f(p)$ if and only if $\{f_j(x_i)\} \to f(p)_j$ for all $1 \le j \le m$, where $f(p)_j$ denotes the $j$th component of $f(p)$.

By the Sequential Characterization of Continuity, we also know that $\{f_j(x_i)\} \to f(p)_j$ for all $1 \le j \le m$ for every sequence $\{x_i\} \subseteq D$ so that $\{x_i\} \to p$ if and only if $f_j$ is continuous at $p$ for all $1 \le j \le m$. Hence, the result follows.


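Theorem 10.18. Let $(X,d)$ be a metric space, let $D \subseteq X$ and let $c$ be a limit point of $D$. Let $g : D \to \mathbb{R}$, and $\lim_{x \to c} g(x) = t \ne 0$. Then there is a $\delta > 0$ so that if $x \in B_\delta(c) \cap D$ and $x \ne c$ then $|g(x)| > \frac{|t|}{2}$, and $c$ is a limit point of $\mathrm{dom}\left(\frac{1}{g}\right)$.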




Proof. We can find $\delta > 0$ so that if $x \in D$ and $0 < d(x,c) < \delta$ then $|g(x) - t| < \frac{|t|}{2}$, which means that $|g(x)| > \frac{|t|}{2}$ and therefore $0 < \frac{1}{|g(x)|} < \frac{2}{|t|}$. Hence if $U$ is any open set containing $c$ then $B_\delta(c) \cap U$ contains a point $q$ distinct from $c$ which is in $D$, and therefore in the domain of $\frac{1}{g}$ since $g(q) \ne 0$. Thus, $c$ is a limit point of $\mathrm{dom}\left(\frac{1}{g}\right)$.


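Theorem 10.19. Let $(X,d)$ be a metric space, let $D \subseteq X$ and let $c$ be a limit point of $D$. Let $f,g : D \to \mathbb{R}$, and $\lim_{x \to c} f(x) = s$ and $\lim_{x \to c} g(x) = t$. Then:
(a) $\lim_{x \to c} \alpha f(x) + \beta g(x) = \alpha s + \beta t$ for any real $\alpha, \beta$.
(b) $\lim_{x \to c} f(x)g(x) = st$.
(c) $\lim_{x \to c} \frac{f(x)}{g(x)} = \frac{s}{t}$ if $t \ne 0$.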
Proof. Let $\{x_n\}$ be a sequence in $D \setminus \{c\}$ which converges to $c$. By the Sequential Characterization of Limits we know that $\{f(x_n)\} \to s$ and $\{g(x_n)\} \to t$. Hence, $\{\alpha f(x_n) + \beta g(x_n)\} \to \alpha s + \beta t$, $\{f(x_n)g(x_n)\} \to st$, and if $t \ne 0$ and $\{x_n\} \subseteq \mathrm{dom}\left(\frac{f}{g}\right)$ then $\left\{\frac{f(x_n)}{g(x_n)}\right\} \to \frac{s}{t}$. By Theorem 10.18, we know that $c$ is a limit point of the domain of $\frac{f}{g}$ if $t \ne 0$, and therefore by the Sequential Characterization of Limits we conclude that (a), (b) and (c) are true.


Theorem 10.20. Let $(X,d)$ be a metric space, let $D \subseteq X$ and let $c \in D$. Let $f,g : D \to \mathbb{R}$ be continuous at $c$. Then $fg$ and $f+g$ are continuous at $c$, and $\frac{f}{g}$ is continuous at $c$ if $g(c) \ne 0$.


Proof. If $c$ is a limit point of $D$ then $fg$ and $f+g$ are continuous at $c$ by Theorem 10.19 and Theorem 10.15, and if $c$ is a limit point of $D$ and $g(c) \ne 0$ then $c$ is a limit point of the domain of $\frac{f}{g}$ by Theorem 10.18 and so $\frac{f}{g}$ is also continuous at $c$.
If $c$ is not a limit point of $D$, then $c$ is not a limit point of the domains of $f+g$ or $fg$ or $\frac{f}{g}$, which means that $f$, $g$, $f+g$, and $fg$ are all continuous at $c$, and $\frac{f}{g}$ is continuous at $c$ if $\frac{f}{g}$ is defined at $c$, which is true if $g(c) \ne 0$, again by Theorem 10.15.


Theorem 10.21. If $U_\alpha$ is open in a metric space $X$ for all $\alpha \in J$ (where $J$ is an arbitrary indexing set) then $\bigcup_{\alpha \in J} U_\alpha$ is open.


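Proof. Let $p \in \bigcup_{\alpha \in J} U_\alpha$. Then $p \in U_\beta$ for some $\beta \in J$, so for some $\epsilon > 0$ it follows that $B_\epsilon(p) \subseteq U_\beta \subseteq \bigcup_{\alpha \in J} U_\alpha$, so $\bigcup_{\alpha \in J} U_\alpha$ is open.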




Theorem 10.22. If $U_1, U_2, \dots, U_n$ are open sets in a metric space $X$ then $\bigcap_{i=1}^n U_i$ is open.


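Proof. If $p \in \bigcap_{i=1}^n U_i$ then for each $i \le n$ we can find $\epsilon_i > 0$ so that $B_{\epsilon_i}(p) \subseteq U_i$. Hence, if we set $\epsilon = \min\{\epsilon_1, \epsilon_2, \dots, \epsilon_n\}$ then $B_\epsilon(p) \subseteq \bigcap_{i=1}^n U_i$.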
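
The finiteness assumption matters because the minimum of infinitely many positive radii need not be positive. The following standard example in $\mathbb{R}$ (supplied here for illustration, not taken from the text) shows an infinite intersection of open sets that is not open.

```latex
% Each U_n is open in R
\[ U_n = \left( -\frac{1}{n},\ \frac{1}{n} \right), \qquad n \in \mathbb{N} \]
% The intersection is a single point, which contains no open ball
\[ \bigcap_{n=1}^{\infty} U_n = \{0\}, \qquad B_\epsilon(0) = (-\epsilon,\epsilon) \not\subseteq \{0\} \ \text{ for every } \epsilon > 0 \]
% Here the radii 1/n have infimum 0, so no single epsilon works.
```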


Theorem 10.23. If $A_\alpha$ is closed in a metric space $X$ for all $\alpha \in J$ (where $J$ is an arbitrary indexing set) then $\bigcap_{\alpha \in J} A_\alpha$ is closed.


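Proof. By DeMorgan's Laws, $X \setminus \bigcap_{\alpha \in J} A_\alpha = \bigcup_{\alpha \in J} (X \setminus A_\alpha)$, which is open by Theorem 10.21. Hence, $\bigcap_{\alpha \in J} A_\alpha$ is closed.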


Theorem 10.24. If $A_1, A_2, \dots, A_n$ are closed subsets of a metric space $X$ then $\bigcup_{i=1}^n A_i$ is closed.


Proof. By DeMorgan's Laws, $X \setminus \bigcup_{i=1}^n A_i = \bigcap_{i=1}^n (X \setminus A_i)$, which is open by Theorem 10.22. Hence, $\bigcup_{i=1}^n A_i$ is closed.


Theorem 10.25. Let $(X,\rho)$ be a metric space and let $A \subseteq E \subseteq X$. Then:
(a) $A$ is closed in $E$ if and only if $A$ contains all of its limit points which are in $E$.
(b) $A$ is open in $E$ if and only if for every $p \in A$ there is an $\epsilon > 0$ so that $B_\epsilon(p) \cap E \subseteq A$.
(c) $A$ is open in $E$ if and only if $A$ is a union of open balls in $E$ (open balls intersected with $E$).


Proof. (a) First assume $A$ is closed in $E$ and pick a closed set $C$ so that $C \cap E = A$. Let $p \in E \setminus A$. Then $p \notin C$. Since $X \setminus C$ is open there is an open ball $B_\epsilon(p) \subseteq X \setminus C$. Since $B_\epsilon(p) \cap C = \emptyset$, $B_\epsilon(p) \cap A = \emptyset$, so $p$ is not a limit point of $A$.

Next assume that $A$ contains all of its limit points which are contained in $E$ and let $p \in (X \setminus A) \cap E$. Then $p$ is not a limit point of $A$, which means that there is some $B_{\epsilon_p}(p)$ which contains no points of $A$. Choosing such a ball for every $p \in (X \setminus A) \cap E$ we obtain an open set $W = \bigcup_{p \in (X \setminus A) \cap E} B_{\epsilon_p}(p)$ which does not intersect $A$ and contains all points of $E \setminus A$. Thus, $X \setminus W$ is closed so $E \cap (X \setminus W) = A$ is closed in $E$.

(b) Assume $A$ is open in $E$. Then there is an open set $U$ so that $U \cap E = A$. Let $p \in A$. Then there is an $\epsilon > 0$ so that $B_\epsilon(p) \subseteq U$, which means that $B_\epsilon(p) \cap E \subseteq A$.

Assume that for every $p \in A$ there is an $\epsilon_p > 0$ so that $B_{\epsilon_p}(p) \cap E \subseteq A$. Let $W = \bigcup_{p \in A} B_{\epsilon_p}(p)$. Then $W$ is open and $W \cap E = A$, which means that $A$ is open in $E$.

(c) First, let $A$ be open in $E$. For each $x \in A$ by part (b) we can choose $\epsilon_x > 0$ so that $B_{\epsilon_x}(x) \cap E \subseteq A$. Then since each point of $A$ is the center of such an open ball, we know $\bigcup_{x \in A} (B_{\epsilon_x}(x) \cap E) \supseteq A$. However, since each $B_{\epsilon_x}(x) \cap E$ is a subset of $A$ we also know that $\bigcup_{x \in A} (B_{\epsilon_x}(x) \cap E) \subseteq A$. Hence, $\bigcup_{x \in A} (B_{\epsilon_x}(x) \cap E) = A$.

Next, assume that $A$ is a union of open balls $\bigcup_{x \in J} (B_{\epsilon_x}(x) \cap E)$ in $E$ (where $J$ is an arbitrary indexing set). Then $A = \left(\bigcup_{x \in J} B_{\epsilon_x}(x)\right) \cap E$, which is open in $E$ since we know that $\bigcup_{x \in J} B_{\epsilon_x}(x)$ is open by Theorem 10.21.


Note that if $E = X$ then the preceding theorem states that a set is closed if and only if it contains all of its limit points, and a set is open if and only if it is a union of open balls.


Definition 73 


Let $A$ be a set in a metric space $X$ and let $A'$ be the set of limit points of $A$. The closure of $A$, denoted $\overline{A}$, is $A \cup A'$. The interior $A^\circ$ of $A$ is the set of all points $p$ of $A$ so that $B_\epsilon(p) \subseteq A$ for some $\epsilon > 0$. The boundary $\partial(A)$ of $A$ is the set of all points $p$ of $X$ so that $B_\epsilon(p)$ contains a point of $A$ and a point of $X \setminus A$ for each $\epsilon > 0$.


Theorem 10.26. Let $A \subseteq X$, where $(X,d)$ is a metric space, and let $p \in X$. Then $p \in \partial(A)$ if and only if every open set containing $p$ contains a point of $A$ and a point of the complement of $A$.


Proof. Assume $p \in \partial(A)$. Let $U$ be an open set containing $p$. For some $\epsilon > 0$ we know that $B_\epsilon(p) \subseteq U$. Since $p \in \partial(A)$ we know that $B_\epsilon(p)$ contains a point of $A$ and a point of the complement of $A$, and thus $U$ does as well.

Assume that every open set containing $p$ contains a point of $A$ and a point of the complement of $A$. Let $\epsilon > 0$. Since we know $B_\epsilon(p)$ is open, it follows that $B_\epsilon(p)$ contains a point of $A$ and a point of the complement of $A$, which means that $p \in \partial(A)$.


Theorem 10.27. Let E be a subset of a metric space X. Then the closure of E is the 
intersection of all closed sets containing E and the interior of E is the union of all open 
sets contained in E. 


Proof. Let $A$ be a closed set containing $E$. Then $A$ contains all of its limit points. All limit points of $E$ are limit points of $A$ by Exercise 10.5, so $A$ contains all limit points of $E$ and therefore $\overline{E} \subseteq A$, so $\overline{E}$ is a subset of the intersection of all closed sets containing $E$. If $p$ is a limit point of $\overline{E}$ then every open set $U$ containing $p$ contains a point $q$ of $\overline{E}$ distinct from $p$. If $q$ is not an element of $E$ then $q$ is a limit point of $E$, so $U$ contains infinitely many points of $E$ and therefore $U$ contains points of $E$ distinct from $p$ by Exercise 10.18. Hence, $p$ is a limit point of $E$ and therefore an element of $\overline{E}$. Hence $\overline{E}$ contains all of its limit points and is closed. Thus, the intersection of all closed sets containing $E$ is a subset of $\overline{E}$, and is thus equal to $\overline{E}$.

By definition, for each $x \in E^\circ$ we can find $\epsilon_x > 0$ so that $B_{\epsilon_x}(x) \subseteq E$. Since $B_{\epsilon_x}(x)$ is open we know that if $y \in B_{\epsilon_x}(x)$ then $y \in B_\delta(y) \subseteq B_{\epsilon_x}(x) \subseteq E$ for some $\delta > 0$, which means $y \in E^\circ$. Hence, $B_{\epsilon_x}(x) \subseteq E^\circ$. Thus, $E^\circ = \bigcup_{x \in E^\circ} B_{\epsilon_x}(x)$, which is open. Hence, $E^\circ$ is a subset of the union of all open sets which are contained in $E$. If $V$ is an open set which is contained in $E$ then for every $p \in V$ we can find $\gamma > 0$ so that $B_\gamma(p) \subseteq V \subseteq E$. Thus, $V \subseteq E^\circ$. It follows that the union of all open sets contained in $E$ is a subset of $E^\circ$ and is therefore equal to $E^\circ$.


Theorem 10.28. Let $E \subseteq X$, a metric space. Then $\partial(E) = \overline{E} \setminus E^\circ$.


Proof. Let $x \in \partial(E)$. Then for each $\epsilon > 0$ we know that $B_\epsilon(x) \not\subseteq E$ so no open subset of $E$ contains $x$ and $x \notin E^\circ$. If $x \in E$ then $x \in \overline{E}$. If $x \notin E$ then $x$ is a limit point of $E$ since every open ball containing $x$ contains a point of $E$. Hence, $x \in \overline{E}$. Thus, $\partial(E) \subseteq \overline{E} \setminus E^\circ$.
Let $x \in \overline{E} \setminus E^\circ$. Since $x \notin E^\circ$, for each $\epsilon > 0$ the set $B_\epsilon(x) \not\subseteq E$, which means that $B_\epsilon(x)$ contains a point of the complement of $E$. Since $x \in \overline{E}$, either $x \in E$ or $x$ is a limit point of $E$. If $x \in E$ then every open ball about $x$ contains a point of $E$ (namely $x$). If $x$ is a limit point of $E$ then every open ball about $x$ contains a point of $E$ by definition of limit point. Thus, every open ball about $x$ contains a point of $E$ and a point not in $E$, which means $x \in \partial(E)$.


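Theorem 10.29. Let $A,B \subseteq \mathbb{R}^n$. Then:
(a) $\overline{A \cup B} = \overline{A} \cup \overline{B}$
(b) $\overline{A \cap B} \subseteq \overline{A} \cap \overline{B}$
(c) $(A \cap B)^\circ = A^\circ \cap B^\circ$
(d) $(A \cup B)^\circ \supseteq A^\circ \cup B^\circ$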


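Proof. (a) Let $x \in \overline{A \cup B}$. Then either $x \in A \cup B$ or $x$ is a limit point of $A \cup B$. If $x \in A \cup B$ then $x \in \overline{A} \cup \overline{B}$. If $x$ is a limit point of $A \cup B$ then either $x$ is a limit point of $A$ or $x$ is a limit point of $B$, so $x \in \overline{A}$ or $x \in \overline{B}$, so $x \in \overline{A} \cup \overline{B}$.

Let $x \in \overline{A} \cup \overline{B}$. Then either $x \in \overline{A}$ or $x \in \overline{B}$. Without loss of generality we may assume $x \in \overline{A}$. Thus, either $x \in A$ or $x$ is a limit point of $A$. If $x \in A$ then $x \in \overline{A \cup B}$. If $x$ is a limit point of $A$ then $x$ is a limit point of $A \cup B$. Thus, $x \in \overline{A \cup B}$. Hence, $\overline{A \cup B} = \overline{A} \cup \overline{B}$.

(b) Let $x \in \overline{A \cap B}$. Either $x \in A \cap B$ or $x$ is a limit point of $A \cap B$. If $x \in A \cap B$ then $x \in \overline{A} \cap \overline{B}$. If $x$ is a limit point of $A \cap B$ then $x$ is a limit point of both $A$ and $B$, so $x \in \overline{A}$ and $x \in \overline{B}$ and thus $x \in \overline{A} \cap \overline{B}$. Hence, $\overline{A \cap B} \subseteq \overline{A} \cap \overline{B}$.

(c) Let $x \in (A \cap B)^\circ$. Then there is an open set $U$ so that $x \in U \subseteq (A \cap B)$. Since $U \subseteq A$ and $U \subseteq B$ it follows that $x \in A^\circ$ and $x \in B^\circ$, so $x \in A^\circ \cap B^\circ$.

Let $x \in A^\circ \cap B^\circ$. Then there are open sets $V_1, V_2$ so that $x \in V_1 \subseteq A$ and $x \in V_2 \subseteq B$, which means that $x \in (V_1 \cap V_2) \subseteq (A \cap B)$, which means that $x \in (A \cap B)^\circ$. Hence, $(A \cap B)^\circ = A^\circ \cap B^\circ$.

(d) Let $x \in A^\circ \cup B^\circ$. Then either $x$ is in an open set contained in $A$ or in an open set contained in $B$. Without loss of generality, we may assume there is an open set $U$ so that $x \in U \subseteq A \subseteq (A \cup B)$. Thus, $x \in (A \cup B)^\circ$ and $(A \cup B)^\circ \supseteq A^\circ \cup B^\circ$.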
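
Parts (b) and (d) are stated as inclusions rather than equalities, and the inclusions can be strict. The intervals below are standard examples in $\mathbb{R}$, supplied here for illustration rather than taken from the text.

```latex
% (b) can be strict: A = (0,1), B = (1,2)
\[ \overline{A \cap B} = \overline{\emptyset} = \emptyset, \qquad
   \overline{A} \cap \overline{B} = [0,1] \cap [1,2] = \{1\} \]
% (d) can be strict: A = [0,1], B = [1,2]
\[ (A \cup B)^\circ = [0,2]^\circ = (0,2), \qquad
   A^\circ \cup B^\circ = (0,1) \cup (1,2), \ \text{ which omits the point } 1 \]
```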




Theorem 10.30. Open Set Characterization of Continuity. Let $(X,d)$ and $(Y,\rho)$ be metric spaces and let $E \subseteq X$. Let $f : E \to Y$. Then:
(a) (Local Form): The function $f$ is continuous at a point $p \in E$ if and only if for every open set $V$ containing $f(p)$, there is an open set $U$ containing $p$ such that $U \cap E \subseteq f^{-1}(V)$.
(b) (Global Form): The function $f$ is continuous if and only if, for every open subset $V$ of $Y$, the set $f^{-1}(V)$ is open in $E$.


Proof. (a) First, assume that $f$ is continuous at $p \in E$. Let $V$ be open in $Y$ containing $f(p)$. Since $V$ is open there is an $\epsilon > 0$ so that $B_\epsilon(f(p)) \subseteq V$. Since $f$ is continuous at $p$ there is a $\delta > 0$ so that if $d(p,x) < \delta$ and $x \in E$ then $\rho(f(x),f(p)) < \epsilon$. Thus $B_\delta(p) \cap E \subseteq f^{-1}(V)$.

Next, assume that for every open set $V$ containing $f(p)$, there is an open set $U$ containing $p$ such that $U \cap E \subseteq f^{-1}(V)$. Let $\epsilon > 0$. Since $B_\epsilon(f(p))$ is open, there is an open set $U$ so that $p \in U \cap E \subseteq f^{-1}(B_\epsilon(f(p)))$, which means that there is a $\delta > 0$ so that $B_\delta(p) \cap E \subseteq U \cap E \subseteq f^{-1}(B_\epsilon(f(p)))$. Hence, if $d(x,p) < \delta$ and $x \in E$ then $\rho(f(x),f(p)) < \epsilon$, which means that $f$ is continuous at $p$.

(b) First, assume $f$ is continuous, and let $V$ be open in $Y$. If $V \cap f(E) = \emptyset$ then $f^{-1}(V)$ is empty and therefore open. Assume $V \cap f(E) \ne \emptyset$. Let $p \in f^{-1}(V)$. By (a) there is an open set $U_p$ in $X$ so that $p \in U_p \cap E \subseteq f^{-1}(V)$. Choosing such an open set $U_p$ for each $p \in f^{-1}(V)$, we see that $U = \bigcup_{p \in f^{-1}(V)} U_p$ is an open set in $X$ so that $U \cap E = f^{-1}(V)$, which is therefore open in $E$.

Next, assume that the inverse image of every open set in $Y$ is open in $E$. Let $p \in E$. Then for every open set $V$ containing $f(p)$, there is an open set $U$ containing $p$ such that $U \cap E = f^{-1}(V)$, which means that $U \cap E \subseteq f^{-1}(V)$. Hence, by part (a) it follows that $f$ is continuous at $p$ for every $p \in E$, so $f$ is continuous.


Definition 74 


We say a function $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation if $T(\alpha x + \beta y) = \alpha T(x) + \beta T(y)$ for all $\alpha,\beta \in \mathbb{R}$ and $x,y \in \mathbb{R}^n$. Let $T : \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation. Then the operator norm $|T| = \sup_{x \ne 0} \frac{|T(x)|}{|x|}$. We use the notation $e_i$ to denote the $i$th standard basis vector for $\mathbb{R}^n$, which is the vector $e_i = \langle 0,0,\dots,0,1,0,0,\dots,0 \rangle$ where every coordinate entry of the vector is zero except the $i$th entry, which is one.


Theorem 10.31. Let $x = (x_1,x_2,x_3,\dots,x_n) \in \mathbb{R}^n$. Then $|x| \le \sum_{i=1}^n |x_i|$.




Proof. We note that $x = \sum_{i=1}^n x_ie_i$ so by the Triangle Inequality (part (c)) we have that $|x| = \left|\sum_{i=1}^n x_ie_i\right| \le \sum_{i=1}^n |x_ie_i| = \sum_{i=1}^n |x_i|$.


Theorem 10.32. Let $T : \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation. Then the operator norm $|T|$ exists and is a non-negative real number. If the matrix for $T$ is $A$ with rows $A_1, A_2, \dots, A_m$ then $|T|$ is less than or equal to $\sqrt{m}\sum_{i=1}^m |A_i| \le \sqrt{m}\sum_{i=1}^m \sum_{j=1}^n |a_{ij}|$.


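Proof. We can find a matrix $A = [a_{ij}]_{m \times n}$ so that $T(x) = Ax$ by Theorem 14.10. Thus, if $A_i$ is the $i$th row vector of $A$ then $Ax = [A_i \cdot x]_{1 \times m}$. Let $C = \max\{|A_1|, \dots, |A_m|\}$. Since $|A_i \cdot x| \le |A_i||x|$ for each $i$ by the Cauchy-Schwarz inequality, we know that
$$|T(x)| = \sqrt{\sum_{i=1}^m (A_i \cdot x)^2} \le \sqrt{\sum_{i=1}^m |A_i|^2|x|^2} \le \sqrt{\sum_{i=1}^m C^2|x|^2} = C\sqrt{m}\,|x|.$$
Hence, $\sup_{x \ne 0} \frac{|T(x)|}{|x|} \le C\sqrt{m} \le \sqrt{m}\sum_{i=1}^m |A_i| \le \sqrt{m}\sum_{i=1}^m \sum_{j=1}^n |a_{ij}|$ by Theorem 10.31.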
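
The bounds in the proof are usually not sharp, which a small example makes visible. The diagonal transformation below is a hypothetical choice made for this illustration; its operator norm can be computed directly.

```latex
% Illustrative linear transformation on R^2 (hypothetical choice, not from the text)
\[ T(x_1,x_2) = (2x_1,\ 3x_2), \qquad A = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}, \qquad m = n = 2 \]
% Direct computation of the operator norm
\[ |T(x)|^2 = 4x_1^2 + 9x_2^2 \le 9(x_1^2 + x_2^2) = 9|x|^2, \ \text{ with equality at } x = e_2, \ \text{ so } |T| = 3 \]
% The chain of bounds from the proof
\[ |A_1| = 2, \quad |A_2| = 3, \quad C = 3, \qquad
   |T| = 3 \ \le\ C\sqrt{m} = 3\sqrt{2} \ \le\ \sqrt{m}\sum_{i=1}^{2} |A_i| = 5\sqrt{2} \]
```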


Definition 75 


A collection $\mathcal{C}$ of sets is a cover of a set $A \subseteq X$ if the union of $\mathcal{C}$ contains $A$. If the sets in $\mathcal{C}$ are open then we call $\mathcal{C}$ an open cover or open covering of $A$. We say that a set $A \subseteq X$ is compact if every open cover $\mathcal{C}$ of $A$ has a finite subset $\mathcal{F}$ which is also a cover of $A$. We call $\mathcal{F}$ a finite subcover of $A$.


Theorem 10.33. Let $K$ be a compact subset of a metric space $X$ and let $f : K \to Y$ be continuous, where $Y$ is a metric space. Then $f(K)$ is compact.


Proof. Let $\mathcal{C} = \{U_\alpha\}_{\alpha \in J}$ be an open cover of $f(K)$. For each $\alpha \in J$ choose $V_\alpha$ open in $X$ so that $V_\alpha \cap K = f^{-1}(U_\alpha)$, which is possible by Theorem 10.30. Then the collection of $V_\alpha$ sets covers $K$ and has a finite subcover $V_{\alpha_1}, V_{\alpha_2}, \dots, V_{\alpha_n}$. The corresponding open sets $U_{\alpha_1}, U_{\alpha_2}, \dots, U_{\alpha_n}$ are a finite subset of $\mathcal{C}$ which covers $f(K)$. Thus $f(K)$ is compact.


Theorem 10.34. Bolzano-Weierstrass Theorem in $\mathbb{R}^m$.

(a) Let $\{x_n\}$ be a sequence of points in $\mathbb{R}^m$ which is bounded. Then $\{x_n\}$ has a convergent subsequence.

(b) Every bounded infinite set in $\mathbb{R}^m$ has a limit point.


Proof. (a) Let $x_n^i$ denote the $i$th component of $x_n$. Since $\{x_n\}$ is bounded, it follows that each $\{x_n^i\}$ is bounded. By the Bolzano-Weierstrass Theorem in $\mathbb{R}$ we know that $\{x_n^1\}$ has a convergent subsequence $\{x^1_{n_{(s_1)}}\} \to p_1$. Likewise, $\{x^2_{n_{(s_1)}}\}$ has a convergent subsequence $\{x^2_{n_{(s_1,s_2)}}\} \to p_2$. We continue in this manner until we pick a subsequence $\{x^m_{n_{(s_1,s_2,\dots,s_m)}}\} \to p_m$. Then $\{x_{n_{(s_1,s_2,\dots,s_m)}}\} \to p = (p_1,p_2,\dots,p_m)$ by Theorem 10.9, since each coordinate sequence is a subsequence of a convergent sequence.
(b) Let $E$ be a bounded infinite set in $\mathbb{R}^m$. We choose a sequence of distinct points of $E$ inductively by choosing $x_1 \in E$ and then if $x_1, x_2, \dots, x_k$ have been chosen to be distinct points of $E$ for some $k \in \mathbb{N}$, then since $E$ is infinite there are points of $E$ which have not been chosen so we can pick $x_{k+1} \in E \setminus \{x_1,x_2,x_3,\dots,x_k\}$. Then the sequence $\{x_n\}$ is a sequence of distinct points which is bounded and therefore has a subsequence $\{x_{n_i}\}$ which converges to a point $p$ by (a). For every $\epsilon > 0$, this means that there is a $t \in \mathbb{N}$ so that if $i > t$ then $x_{n_i} \in B_\epsilon(p)$, which means that $B_\epsilon(p)$ contains infinitely many points of $E$ and so $p$ is a limit point of $E$.


Theorem 10.35. Let $K$ be a closed and bounded set in $\mathbb{R}^n$ and let $f : K \to \mathbb{R}^m$ be continuous. Then $f$ is uniformly continuous.


Proof. Suppose $f$ is not uniformly continuous. Then there is an $\epsilon > 0$ so that for every $\delta > 0$ there are points $x,y \in K$ so that $|x-y| < \delta$ and $|f(x) - f(y)| \ge \epsilon$. For each $n \in \mathbb{N}$ choose points $x_n, y_n \in K$ so that $|x_n - y_n| < \frac{1}{n}$ and $|f(x_n) - f(y_n)| \ge \epsilon$. By the Bolzano-Weierstrass Theorem, we can find a convergent subsequence $\{x_{n_i}\} \to p$ where $p \in K$ since $K$ is closed. Since $\{|x_{n_i} - y_{n_i}|\} \to 0$ we also know that $\{y_{n_i}\} \to p$. Since $f$ is continuous it follows that $\{f(x_{n_i})\} \to f(p)$ and $\{f(y_{n_i})\} \to f(p)$, and hence $\{|f(x_{n_i}) - f(y_{n_i})|\} \to 0$, which is impossible since $|f(x_{n_i}) - f(y_{n_i})| \ge \epsilon$ for all $i \in \mathbb{N}$. Hence, $f$ is uniformly continuous.
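
The hypothesis that $K$ is closed is used exactly where the proof places the limit $p$ inside $K$. The following standard example (an illustration supplied here, not from the text) shows the conclusion failing on a bounded set that is not closed.

```latex
% f is continuous on the bounded, non-closed set (0,1]
\[ f(x) = \frac{1}{x} \ \text{ on } (0,1], \qquad \epsilon = 1 \]
% Points that get close together while their images stay far apart
\[ x_n = \frac{1}{n}, \quad y_n = \frac{1}{2n}, \qquad
   |x_n - y_n| = \frac{1}{2n} < \frac{1}{n}, \qquad |f(x_n) - f(y_n)| = |n - 2n| = n \ge 1 \]
% Both sequences converge to 0, which is not in (0,1], so the step "p in K" is unavailable.
```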


Definition 76 


Let $X$ be a metric space. We say that $D$ is dense in $X$ if every open ball in $X$ contains a point of $D$. We say $X$ is separable if there is a countable set $D \subseteq X$ which is dense in $X$. We say that a collection $\mathcal{B}$ of non-empty open sets is a basis for $X$ if for every $p \in X$ and open set $U$ containing $p$ there is some $B \in \mathcal{B}$ so that $p \in B \subseteq U$. We say that a set $E \subseteq X$ is Lindelöf if every open covering of $E$ has a countable subset which is a cover of $E$ (a countable subcover).


Note that by theorem 10.25, the epsilon balls about points of X form a basis for a metric 
space X. 


Theorem 10.36. A metric space X is separable if and only if X has a countable basis. 


Proof. Let $X$ have countable basis $\mathcal{B} = \{B_n\}_{n \in \mathbb{N}}$. Choose one point $p_n$ from each $B_n$ to form a set $D = \{p_1, p_2, p_3, \dots\}$. Each non-empty open set $U$ contains a set $B_n$ containing a point $p_n \in D$, so $D$ is dense in $X$.

Let $X$ be separable with countable dense set $D = \{p_1, p_2, p_3, \dots\}$ and order the positive rational numbers as $\{q_1, q_2, q_3, \dots\}$. Let $B(i,j) = B_{q_i}(p_j)$ for each pair of positive integers $i,j$. Then let $\mathcal{B} = \{B(i,j) \mid i,j \in \mathbb{N}\}$. Since $\mathcal{B}$ is a countable union of countable sets it is countable. Let $U$ be any open set and $p \in U$. Then there is some $\epsilon > 0$ so that $B_{3\epsilon}(p) \subseteq U$ and some $p_n \in B_\epsilon(p)$. Let $\epsilon < q_m < 2\epsilon$ where $q_m$ is rational. Then $p \in B_{q_m}(p_n) \subseteq B_{3\epsilon}(p) \subseteq U$. Thus, $\mathcal{B}$ is a basis for $X$.


Theorem 10.37. $\mathbb{R}^n$ is separable and has a countable basis.


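Proof. Let $D = \{x \in \mathbb{R}^n \mid x_i \in \mathbb{Q} \text{ for each } 1 \le i \le n\}$, the set of all points each of whose coordinates is rational, which is countable. Let $U$ be any non-empty open set and choose $B_\epsilon(p) \subseteq U$, where $p = (p_1, p_2, \dots, p_n)$. Then $\prod_{i=1}^n \left(p_i - \frac{\epsilon}{n},\ p_i + \frac{\epsilon}{n}\right) \subseteq B_\epsilon(p)$, and for each $1 \le i \le n$ we can pick a rational number $q_i \in \left(p_i - \frac{\epsilon}{n},\ p_i + \frac{\epsilon}{n}\right)$, which means that $q = (q_1, q_2, q_3, \dots, q_n) \in U$, so $D$ is dense in $\mathbb{R}^n$. Thus, by Theorem 10.36, we know that $\mathbb{R}^n$ has a countable basis.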


Theorem 10.38. Let $E \subseteq \mathbb{R}^n$. Then $E$ is Lindelöf.


Proof. Let $\mathcal{C}$ be an open cover of $E$. Let $\mathcal{B}$ be a countable basis for $\mathbb{R}^n$ and let $\mathcal{B}' = \{B_1, B_2, B_3, \dots\}$ be the elements of $\mathcal{B}$ which contain a point of $E$ and are also contained in an element of $\mathcal{C}$. For each $B_i \in \mathcal{B}'$ we choose some $U_i \in \mathcal{C}$ which contains $B_i$, and let $\mathcal{C}' = \{U_1, U_2, U_3, \dots\}$. For each $p \in E$ there is some $U \in \mathcal{C}$ which contains $p$ and thus some $B_i \in \mathcal{B}'$ so that $p \in B_i \subseteq U$, which means that $p \in U_i$. Hence, $\mathcal{C}'$ is a countable subset of $\mathcal{C}$ which covers $E$.


Theorem 10.39. Heine-Borel Theorem. Let $K$ be a subset of $\mathbb{R}^m$. Then $K$ is compact if and only if $K$ is closed and bounded.


Proof. First, assume that $K$ is compact. Then $\{B_n(0)\}_{n \in \mathbb{N}}$ covers $K$ so it has a finite subcover $\mathcal{F} = \{B_{n_1}(0), B_{n_2}(0), \dots, B_{n_j}(0)\}$ where $n_1 < n_2 < \dots < n_j$, so $\bigcup \mathcal{F} = B_{n_j}(0)$ contains $K$, and $K$ is bounded.

Next, let $p$ be a limit point of $K$ and suppose that $p \notin K$. Then if we set $U_n = \mathbb{R}^m \setminus \overline{B}_{1/n}(p)$ then $\{U_n\}_{n \in \mathbb{N}}$ is an open cover of $K$ and has a finite subcover $\mathcal{F} = \{\mathbb{R}^m \setminus \overline{B}_{1/n_1}(p), \mathbb{R}^m \setminus \overline{B}_{1/n_2}(p), \dots, \mathbb{R}^m \setminus \overline{B}_{1/n_j}(p)\}$ where $n_1 < n_2 < \dots < n_j$, so $\bigcup \mathcal{F} = \mathbb{R}^m \setminus \overline{B}_{1/n_j}(p)$ contains $K$. But this is impossible because $p$ is a limit point of $K$ so $B_{1/n_j}(p)$ must contain points of $K$. Hence, $K$ contains all of its limit points and is closed.

Finally, assume that $K$ is closed and bounded. Let $\mathcal{C}$ be an open cover of $K$. Suppose $\mathcal{C}$ has no finite subcover. By Theorem 10.38 there is a countable subcover $\{U_n\}_{n \in \mathbb{N}}$. For each $n \in \mathbb{N}$ we choose a point $p_n \in K \setminus \bigcup_{j=1}^n U_j$. Since $K$ is bounded, by the Bolzano-Weierstrass Theorem there is a convergent subsequence $\{p_{n_i}\} \to p$, where $p \in K$ since $K$ is closed. Thus, there is some $t$ so that $p \in U_t$, which means that there is a $k \in \mathbb{N}$ so that if $i > k$ then $p_{n_i} \in U_t$, which we know is false if $i > t$, a contradiction. Hence, we conclude that $K$ is compact.
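
A bounded set that is not closed illustrates why both conditions are needed. The open cover below is a standard example in $\mathbb{R}$, supplied here for illustration rather than taken from the text.

```latex
% The bounded set (0,1) and an open cover with no finite subcover
\[ K = (0,1), \qquad U_n = \left( \frac{1}{n},\ 1 \right) \ \text{ for } n \ge 2, \qquad \bigcup_{n=2}^{\infty} U_n = (0,1) \]
% Any finite subfamily with largest index N covers only (1/N, 1)
\[ U_{n_1} \cup \dots \cup U_{n_j} = \left( \frac{1}{N},\ 1 \right) \ \text{ where } N = \max\{n_1,\dots,n_j\},
   \qquad \frac{1}{N+1} \in K \setminus \left( \frac{1}{N},\ 1 \right) \]
% So (0,1) is not compact, consistent with the theorem: 0 is a limit point of K not in K.
```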




Theorem 10.40. Let K be a compact subset of R. Then K has a first point and a last 
point. 


Proof. We know that $K$ is closed and bounded by the Heine-Borel Theorem, which means that $K$ has a least upper bound $u$ and a greatest lower bound $b$. Suppose that $u \notin K$. Then by the Approximation Property, for every $\epsilon > 0$ there is some $x \in K$ so that $u - \epsilon < x < u$, which means that $u$ is a limit point of $K$. But this means that $u \in K$ since $K$ contains all of its limit points, a contradiction. Likewise, $b \in K$.


Theorem 10.41. The Extreme Value Theorem (for metric spaces). Let $X$ be a metric space and let $f : K \to \mathbb{R}$ be continuous, where $K$ is a compact subset of $X$. Then there are points $s,t \in K$ so that $f(s) \le f(x) \le f(t)$ for all $x \in K$.


Proof. By Theorem 10.33, we know that $f(K)$ is compact and therefore has a first and last point by Theorem 10.40.


Theorem 10.42. The Lebesgue Number Lemma. Let $\mathcal{C}$ be an open cover of a compact set $K$ in a metric space $X$. Then there is a number $\delta > 0$ so that for any $x \in K$ the ball $B_\delta(x)$ is a subset of an element of $\mathcal{C}$. Furthermore, if $S$ is a set with diameter less than $\delta$ and $S \cap K \ne \emptyset$ then $S$ is a subset of some element of $\mathcal{C}$.


Proof. For each x € K, we choose an €, > 0 so that Bo,.,(2) C W for some W € C. 
Then {B.,(x)|c € K} is an open cover of K, and since K is a compact space we know 
that there is a finite subcover F = {B,,, (21), Be,, (£2), Bez, (U3), ---Bez, (Un) }. Let 6 = 
TOV Gogo Egceeee fs 

Let x € K. Then for some 2; we know that x € B,,,(xi), so if y € Bs(x) then 
PY, Vi) < 2€z,, 80 Bs(x) C Boe, (zi) C W for some W € C. 

If diam(S) < 6 and x € KMS then S C B;(x) which is a subset of an element of C. 


Definition 77 


Let $E \subset \mathbb{R}^n$. A pair of non-empty sets $H$ and $K$ is a separation of $E$ if $H, K \subset E$,
$\overline{H} \cap K = \emptyset = \overline{K} \cap H$ and $H \cup K = E$. We say $E \subset \mathbb{R}^n$ is connected if it has no
separation. We say that $E$ is path connected if, for each pair of points $p, q \in E$ there
is a continuous function $f : [a,b] \to E$ so that $f(a) = p$ and $f(b) = q$. We say
that $E$ is polygonally connected if, for each pair of points $p, q \in E$ there is a finite
sequence of line segments $L(x_1, x_2), L(x_2, x_3), \dots, L(x_{m-1}, x_m)$ which are contained
in $E$ so that $x_1 = p$ and $x_m = q$. We refer to the union of these line segments as a
polygonal path from $p$ to $q$. If $L(p,q) \subset E$ for each $p, q \in E$ then $E$ is convex.




Theorem 10.43. Let $E \subset \mathbb{R}^n$. Let $H$ and $K$ be disjoint non-empty subsets of $E$ so that
$H \cup K = E$. Then $H$ and $K$ are a separation of $E$ if and only if $H$ and $K$ are open in $E$.


Proof. First, assume that $H$ and $K$ are a separation of $E$. Then $H$ contains no limit
points of $K$, so for every point $p \in H$ we can find $\epsilon_p > 0$ so that $B_{\epsilon_p}(p) \cap K = \emptyset$. Thus,
$U = \bigcup_{p \in H} B_{\epsilon_p}(p)$ is an open set and $U \cap E = H$, which means that $H$ is open in $E$. Likewise,
$K$ is open in $E$.


Next, assume that $H$ and $K$ are open in $E$. We already know that $H$ and $K$ are disjoint
non-empty subsets of $E$ so that $H \cup K = E$, so we need only verify that $H$ contains no
limit points of $K$ and $K$ contains no limit points of $H$. Let $p \in H$. There is an open set $V$
so that $V \cap E = H$ since $H$ is open in $E$. Hence, $p$ is contained in an open set which does
not intersect $K$, so $p$ is not a limit point of $K$. Similarly, $K$ contains no limit points of $H$.


Theorem 10.44. Let $C$ be a connected subset of $\mathbb{R}^n$ and let $f : C \to \mathbb{R}^m$ be continuous.
Then $f(C)$ is connected.


Proof. Suppose $f(C)$ is not connected. Then it has a separation $H$ and $K$, where $H = U \cap f(C)$ and $K = V \cap f(C)$ for some open sets $U$ and $V$ in $\mathbb{R}^m$, where $H$ and $K$ are
non-empty, disjoint and have $H \cup K = f(C)$. By theorem 10.30, this means that $f^{-1}(H)$
and $f^{-1}(K)$ are open in $C$. Since they are also disjoint and non-empty and have union
equal to $C$, it follows that $C$ is not connected, a contradiction.


Theorem 10.45. Intermediate Value Theorem in $\mathbb{R}^n$. Let $H$ be a connected subset of $\mathbb{R}^n$
and let $f : H \to \mathbb{R}$ be continuous, where $f(a) < k < f(b)$ for some $a, b \in H$. Then there is
some $c \in H$ so that $f(c) = k$.


Proof. From theorem 10.44 we know that $f(H)$ is connected and is therefore an interval by
theorem 7.51. Thus, by the definition of interval, $k \in f(H)$.


Theorem 10.46. Every polygonally connected set is path connected and every path connected 
set is connected. 


Proof. Let $E$ be polygonally connected. Let $p, q \in E$. Then there are line segments
$L(x_0, x_1), L(x_1, x_2), \dots, L(x_{m-1}, x_m) \subset E$ where $x_0 = p$ and $x_m = q$, whose union is a
polygonal path $P(p,q)$. Define $f : [0, m] \to E$ by $f(t) = (i + 1 - t)x_i + (t - i)x_{i+1}$ for all
$i \le t \le i + 1$, for each $0 \le i \le m - 1$. Then $f$ is a continuous function whose image is the
polygonal path $P(p,q)$. Thus, $E$ is path connected.
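
The parametrization used in this argument can be sketched in code. The following is only an illustration, assuming the vertices are indexed $x_0, \dots, x_m$ and the parameter runs over $[0, m]$; the function name polygonal_path and the sample vertices are hypothetical choices made for the example.

```python
def polygonal_path(points):
    """Return f : [0, m] -> R^n tracing L(x_0, x_1), ..., L(x_{m-1}, x_m).

    `points` is the list [x_0, ..., x_m]; on [i, i + 1] the map is
    f(t) = (i + 1 - t) x_i + (t - i) x_{i+1}.
    """
    m = len(points) - 1
    if m < 1:
        raise ValueError("need at least two points")

    def f(t):
        if not 0 <= t <= m:
            raise ValueError("t must lie in [0, m]")
        i = min(int(t), m - 1)  # index of a segment whose parameter range contains t
        a, b = points[i], points[i + 1]
        return tuple((i + 1 - t) * ai + (t - i) * bi for ai, bi in zip(a, b))

    return f


# Hypothetical example: a path from (0, 0) to (1, 2) with one corner at (1, 0).
path = polygonal_path([(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)])
```

At an integer parameter the two neighboring formulas agree, which is why the pieces glue together into a single continuous function.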

Assume that $E$ is path connected. Suppose $E$ is not connected. Then $E$ has a separation
consisting of sets $H$ and $K$ which are non-empty, neither of which contains a limit point or
point of the other, and whose union is $E$. Let $p \in H$ and $q \in K$. Since $E$ is path connected
there is a continuous function $f : [a,b] \to E$ so that $f(a) = p$ and $f(b) = q$. Since $[a,b]$ is
connected by Theorem 7.51 and $f$ is continuous, we know that $f([a,b]) = C$ is a connected
set by Theorem 10.44.

Since $H$ and $K$ are a separation of $E$, we know that $H$ and $K$ are open in $E$, so there
are open sets $U$ and $V$ so that $U \cap E = H$ and $V \cap E = K$. Let $H' = H \cap C = U \cap C$ and
$K' = K \cap C = V \cap C$. Then $H'$ and $K'$ are open in $C$ and are a separation of $C$, which
contradicts the fact that $C$ is connected. We conclude that $E$ is connected.


Theorem 10.47. Let U be a connected open subset of R”. Then U is polygonally connected. 


Proof. Let $a \in U$. Let $S = \{x \in U \mid$ there is a polygonal path $P(a,x) = L(x_0,x_1) \cup L(x_1,x_2) \cup \dots \cup L(x_{m-1},x_m) \subset U$ from $a = x_0$ to $x = x_m\}$. We will prove that $S$ is both open
and closed in $U$.

To show $S$ is open, let $p \in S$. Then $p \in U$ so we can find $\epsilon > 0$ so that $B_\epsilon(p) \subset U$.
Let $z \in B_\epsilon(p)$. Since $p \in S$ there is a polygonal path $P(a,p) = L(a,x_1) \cup L(x_1,x_2) \cup \dots \cup L(x_{m-1},p) \subset U$. Since $L(p,z) \subset B_\epsilon(p) \subset U$, the union $L(a,x_1) \cup L(x_1,x_2) \cup \dots \cup L(x_{m-1},p) \cup L(p,z)$ is a polygonal path
from $a$ to $z$ which is contained in $U$, which means that $B_\epsilon(p) \subset S$ and therefore $S$ is open.

To show that $S$ contains all limit points of $S$ which are contained in $U$, let $q \in U$ be a
limit point of $S$. Let $\gamma > 0$ so that $B_\gamma(q) \subset U$. Then $B_\gamma(q)$ contains a point $w \in S$. Since
$w \in S$, there is a polygonal path $P(a,w) = L(a,x_1) \cup L(x_1,x_2) \cup \dots \cup L(x_{m-1},w) \subset U$. But
then $P(a,w) \cup L(w,q)$ is a polygonal path from $a$ to $q$ which is contained in $U$, meaning
that $q \in S$.

Either $S = U$, in which case $U$ is polygonally connected, or not. If not then let $H = S$
and let $K = U \setminus S$. Then $H, K$ are disjoint, non-empty and have union equal to $U$, and
since $H$ is open we know that $H$ contains no limit points of $K$, and since $H$ contains all of
its limit points which are in $U$, it follows that $K$ contains no limit points of $H$. Hence, $H$
and $K$ are a separation of $U$, which is impossible since $U$ is connected. We conclude that
$S = U$ and $U$ is polygonally connected.


Theorem 10.48. Let $f : [a,b] \to \mathbb{R}$ and let $G = \{(x, f(x)) \in \mathbb{R}^2 \mid x \in [a,b]\}$ be the graph of
$f$. Then $f$ is continuous if and only if $G$ is both closed and connected.


Proof. First, assuming $f$ is continuous we know that $F(x) = (x, f(x))$ is continuous by
Theorem 10.17, which means that $F([a,b]) = G$ is connected and compact and therefore
closed (by Theorems 10.44, 10.33 and the Heine-Borel Theorem).

Next, assume that $G$ is connected and closed. We first check that $f$ satisfies the property
that if $a \le c < d \le b$ and $f(c) < r < f(d)$ or $f(c) > r > f(d)$ then $f(q) = r$ for some
$q \in (c,d)$. Suppose $f(c) < r < f(d)$ and there is no $q \in (c,d)$ so that $f(q) = r$. Then
$H = \{(x,y) \in G \mid x \le c, \text{ or } x < d \text{ and } y < r\}$ and $K = \{(x,y) \in G \mid x \ge d, \text{ or } x > c \text{ and } y > r\}$
is a separation of $G$, contradicting that $G$ is connected. In the case that $f(c) > r > f(d)$ we
can negate $f$ and apply the previous case. We will refer to this condition as the intermediate
value property.

Suppose that $f$ has a discontinuity at a point $c \in [a,b]$. Then we can find $\epsilon > 0$ so that
for every $\delta > 0$ we can find $x \in [a,b]$ so that $|x - c| < \delta$ but $|f(x) - f(c)| \ge \epsilon$. For each
$n \in \mathbb{N}$ choose $x_n \in [a,b]$ so that $|x_n - c| < \frac{1}{n}$ and $|f(x_n) - f(c)| \ge \epsilon$. Either there are
infinitely many integers $n$ so that $f(x_n) \ge f(c) + \epsilon$ or there are infinitely many integers $n$ so
that $f(x_n) \le f(c) - \epsilon$. Assume there are infinitely many integers $n$ so that $f(x_n) \ge f(c) + \epsilon$,
yielding a subsequence $\{x_{n_i}\}$ so that $f(x_{n_i}) \ge f(c) + \epsilon$ for each $i \in \mathbb{N}$.

Since $f$ has the intermediate value property, for each $i \in \mathbb{N}$ we can find a point $c_i$ with
$|c_i - c| \le |x_{n_i} - c|$ so that $f(c_i) = f(c) + \epsilon$ (taking $c_i = x_{n_i}$ if $f(x_{n_i}) = f(c) + \epsilon$). Then $\{(c_i, f(c) + \epsilon)\} \to (c, f(c) + \epsilon)$, which means
that $(c, f(c) + \epsilon)$ is a limit point of $G$, so $G$ is not closed (since $G$ contains only the point
$(c, f(c))$ on the line $x = c$). The argument is similar if there are infinitely many integers $n$
so that $f(x_n) \le f(c) - \epsilon$. This contradiction implies that $f$ is continuous.




Exercises: 


Exercise 10.1. Let $E \subset \mathbb{R}^n$ and let $p \notin E$. Then $p \in \partial(E)$ if and only if $p$ is a limit point
of $E$.


Exercise 10.2. Let $U \subset \mathbb{R}^n$. Then $U$ is open if and only if $U$ contains none of its boundary
points.


Exercise 10.3. Let $A \subset \mathbb{R}^n$. Then $A$ is closed if and only if $A$ contains all of its boundary
points.


Exercise 10.4. If $p$ is a limit point of $\bigcup_{i=1}^{n} E_i$ in a metric space $(X,d)$ then $p$ is a limit point
of $E_k$ for some $k$.


Exercise 10.5. If $p$ is a limit point of $A$ and $A \subset B$ in a metric space $X$ then $p$ is a limit
point of $B$.


Exercise 10.6. A point $p$ is a limit point of a set $E$ in a metric space $(X,d)$ if and only if
every open set containing $p$ contains infinitely many points of $E$, which is true if and only
if there is a sequence of points in $E \setminus \{p\}$ which converges to $p$.


Exercise 10.7. If $E_1, E_2$ are connected subsets of $\mathbb{R}^n$ which share a common point $p$ then
$E_1 \cup E_2$ is connected.


Exercise 10.8. Give examples of open sets $U_1, U_2, U_3, \dots$ so that $\bigcap_{i=1}^{\infty} U_i$ is not open.


Exercise 10.9. Show that no subset of $\mathbb{Q}^n$ containing more than one point is connected.


Exercise 10.10. The empty set is the only proper subset of $\mathbb{R}^n$ which is both closed and
open.


Exercise 10.11. Let $E \subset \mathbb{R}^n$. Then $\overline{E} = E \cup \partial(E)$.


Exercise 10.12. If $A \subset K \subset \mathbb{R}^n$ and $K$ is closed then $A$ is closed in $K$ if and only if $A$ is
closed.


Exercise 10.13. A point $p$ of a metric space $(X,d)$ is a limit point of a set $E \subset X$ if and
only if every open set containing $p$ contains infinitely many points of $E$.




Solutions: 


Solution to Exercise 10.1. Let $E \subset \mathbb{R}^n$ and let $p \notin E$. Then $p \in \partial(E)$ if and only if $p$ is
a limit point of $E$.


Proof. Every open set containing $p$ contains a point not in $E$ (namely $p$). Thus, $p$ is a
limit point of $E$ if and only if every open set containing $p$ contains a point of $E$ (since such
a point will always be distinct from $p$), which is also true if and only if $p$ is a boundary
point of $E$.


Solution to Exercise 10.2. Let $U \subset \mathbb{R}^n$. Then $U$ is open if and only if $U$ contains none
of its boundary points.


Proof. Assume $U$ is open. Let $p \in U$. Then since $U$ is open there is an open ball containing
$p$ which is contained in $U$ and therefore contains no points which are not contained in $U$,
so $p$ is not a boundary point of $U$.

Assume that $U$ contains none of its boundary points. Let $p \in U$. Since $p \in U$ we
know $p$ is not a boundary point of $U$, which means that there is an open ball containing $p$
which contains no points which are not in $U$ (since such a ball must contain $p$ and therefore
contains a point of $U$). Hence, $p$ is contained in an open set which is contained in $U$, which
means that $U$ is open.


Solution to Exercise 10.3. Let $A \subset \mathbb{R}^n$. Then $A$ is closed if and only if $A$ contains all
of its boundary points.


Proof. First, assume that A is closed. Since it has been established that every boundary 
point of A which is not in A must be a limit point of A, and also that closed sets contain 
all of their limit points, it follows that every boundary point of A is a point of A. 

Next, assume that A contains all of its boundary points. Then if p ¢ A it follows that 
p is not a boundary point of A, which means that p is not a limit point of A since all limit 
points of A which are not contained in A are boundary points of A. Thus, A contains all 
of its limit points and is therefore closed. 


Solution to Exercise 10.4. If $p$ is a limit point of $\bigcup_{i=1}^{n} E_i$ in a metric space $(X,d)$ then $p$ is
a limit point of $E_k$ for some $k$.


Proof. Suppose $p$ is not a limit point of any $E_i$. For each $i$ we can choose an $\epsilon_i > 0$ so
that $B_{\epsilon_i}(p)$ contains no points of $E_i$ distinct from $p$. Let $\epsilon = \min\{\epsilon_1, \epsilon_2, \dots, \epsilon_n\}$. Then $B_\epsilon(p)$
contains no points of $\bigcup_{i=1}^{n} E_i$ distinct from $p$, a contradiction to $p$ being a limit point of
$\bigcup_{i=1}^{n} E_i$.


Solution to Exercise 10.5. If $p$ is a limit point of $A$ and $A \subset B$ in a metric space $X$
then $p$ is a limit point of $B$.


Proof. If p is a limit point of A then every open set containing p contains a point of A 
distinct from p, which is also a point of B distinct from p since every point of A is a point 
of B. Thus, p is a limit point of B. 


Solution to Exercise 10.6. A point $p$ is a limit point of a set $E$ in a metric space $(X,d)$
if and only if every open set containing $p$ contains infinitely many points of $E$, which is true
if and only if there is a sequence of points in $E \setminus \{p\}$ which converges to $p$.


Proof. By Theorem 10.12, $p$ is a limit point of $E$ if and only if we can find a sequence
$\{x_n\} \subset E \setminus \{p\}$ which converges to $p$.

Assume $p$ is a limit point of $E$ and let $U$ be an open set containing $p$. Suppose $U$ contains only finitely many points
$p_1, p_2, \dots, p_k \in E \setminus \{p\}$. Choose $0 < \epsilon < \min_{1 \le i \le k}\{d(p, p_i)\}$ so that $B_\epsilon(p) \subset U$. Then $B_\epsilon(p)$ contains
no points of $E$ distinct from $p$, which is a contradiction.

Next, assume every open set containing $p$ contains infinitely many points of $E$ and let
$\delta > 0$. Then the open set $B_\delta(p)$ contains infinitely many points of $E$ and therefore contains
points of $E$ distinct from $p$. Hence, $p$ is a limit point of $E$.


Solution to Exercise 10.7. If $E_1, E_2$ are connected subsets of $\mathbb{R}^n$ which share a common
point $p$ then $E_1 \cup E_2$ is connected.


Proof. Suppose $E_1 \cup E_2$ is not connected and $p \in E_1 \cap E_2$. Let $A, B$ be a separation of
$E_1 \cup E_2$ and let $p \in A$. Then there are open sets $U, V$ in $\mathbb{R}^n$ so that $A = U \cap (E_1 \cup E_2)$
and $B = V \cap (E_1 \cup E_2)$. If $B$ contains points of $E_1$ then we know that $V \cap E_1 = B_1$
is non-empty and since $A$ contains $p$ we know that $U \cap E_1 = A_1$ is non-empty. We also
know that $A_1$ and $B_1$ are disjoint, that $A_1 \cup B_1 = E_1$ and that both $A_1$ and $B_1$ are open in $E_1$, which means that $E_1$
is not connected, which is impossible. We must conclude that $B$ contains no points of $E_1$.
However, a similar argument shows that $B$ contains no points of $E_2$, so $B$ is empty, which is
a contradiction to the assumption that $A$ and $B$ are a separation of $E_1 \cup E_2$. We conclude
that $E_1 \cup E_2$ is connected.


Solution to Exercise 10.8. Give examples of open sets $U_1, U_2, U_3, \dots$ so that $\bigcap_{i=1}^{\infty} U_i$ is not
open.


Proof. Let $U_i = B_{1/i}(0)$ for each natural number $i$. Then $\bigcap_{i=1}^{\infty} U_i = \{0\}$ is not an open set.
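
The reason the intersection collapses to $\{0\}$ is that every non-zero point is excluded by some ball $B_{1/i}(0)$. The sketch below illustrates this on the real line; the function name first_excluding_index is a hypothetical helper, and floating-point numbers stand in for real numbers.

```python
import math


def first_excluding_index(x):
    """Return a natural number i with |x| >= 1/i, so that x is not in B_{1/i}(0).

    Returns None when x == 0, since 0 lies in every ball B_{1/i}(0).
    """
    r = abs(x)
    if r == 0:
        return None
    # |x| >= 1/i is equivalent to i >= 1/|x|, so round 1/|x| up.
    return max(1, math.ceil(1 / r))


index_for_quarter = first_excluding_index(0.25)
index_for_five = first_excluding_index(5.0)
```

Any point other than 0 therefore fails to lie in the intersection, while no ball around 0 fits inside $\{0\}$.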




Solution to Exercise 10.9. Show that no subset of $\mathbb{Q}^n$ containing more than one point is
connected.


Proof. Let $p = (p_1, p_2, \dots, p_n)$ and $q = (q_1, q_2, \dots, q_n)$ be distinct points in a set $S \subset \mathbb{Q}^n$,
where $|p - q| = n\epsilon$. For each integer $1 \le i \le n$ choose irrational numbers $s_i, t_i$ so that
$p_i - \epsilon < s_i < p_i < t_i < p_i + \epsilon$. Let $U = \prod_{i=1}^{n} (s_i, t_i)$ and let $V = \mathbb{R}^n \setminus U$. Then $U \cap S$ is
non-empty since it contains $p$, and $V \cap S$ is non-empty since it contains $q$. Both sets are open in
$S$. Furthermore, if $z = (z_1, z_2, \dots, z_n)$ is a limit point of $U$ which is not contained in $U$ then
some $z_i$ must equal either $s_i$ or $t_i$ since if $z_i > t_i$ or $z_i < s_i$ then $\{(x_1, x_2, \dots, x_n) \in \mathbb{R}^n \mid x_i > t_i$
or $x_i < s_i\}$ is an open set containing $z$ which does not intersect $U$. But this means that
$z \notin S$ since one of its coordinates is not rational. Thus, $U \cap S$ is both closed and open in $S$,
and so is $V \cap S$, which means $U \cap S$ and $V \cap S$ are a separation of $S$.


Solution to Exercise 10.10. The empty set is the only proper subset of $\mathbb{R}^n$ which is both
closed and open.


Proof. First, we show that $\mathbb{R}^n$ is connected. Suppose $\mathbb{R}^n$ has a separation $H, K$ and let
$p \in H$ and $q \in K$. For some open sets $U, V$ it follows that $H = U \cap \mathbb{R}^n = U$ and
$K = V \cap \mathbb{R}^n = V$ which means that $H$ and $K$ are open sets. Define $f : [0,1] \to \mathbb{R}^n$ by
$f(t) = p + t(q - p)$. Then $f$ is continuous, so since the continuous image of a connected
set is connected, $L(p,q) = f([0,1])$ is connected. But then $H \cap L(p,q)$ and $K \cap L(p,q)$ are
non-empty, their union is $L(p,q)$ and they are open in $L(p,q)$, making them a separation
of $L(p,q)$, which is a contradiction. We conclude that $\mathbb{R}^n$ is connected.

Let $U$ be a non-empty open proper subset of $\mathbb{R}^n$. If $U$ is also closed then $\mathbb{R}^n \setminus U$ is open, making
$U, \mathbb{R}^n \setminus U$ a separation of $\mathbb{R}^n$, which is impossible since $\mathbb{R}^n$ is connected.


Solution to Exercise 10.11. Let $E \subset \mathbb{R}^n$. Then $\overline{E} = E \cup \partial(E)$.


Proof. We know that $\overline{E}$ consists of the points of $E$ and the limit points of $E$. Let $p$ be a
point which is not in $E$. Then every open set containing $p$ contains a point not in $E$ (namely
$p$). Thus, $p$ is a limit point of $E$ if and only if every open set containing $p$ contains a point
of $E$ (since such a point will always be distinct from $p$), and $p$ is a boundary point of $E$ if
and only if every open set containing $p$ contains a point of $E$. Since any point not in $E$ is
in the boundary of $E$ if and only if it is in the closure of $E$, we see that $\overline{E} = E \cup \partial(E)$.


Solution to Exercise 10.12. If $A \subset K \subset \mathbb{R}^n$ and $K$ is closed then $A$ is closed in $K$ if
and only if $A$ is closed.


Proof. Let $A$ be closed. Then $A \cap K = A$ is closed in $K$ by definition. Assume $A$ is closed in
$K$. Then for some closed set $B$ we know $B \cap K = A$. Since $K$ is closed and the intersection
of closed sets is closed, $B \cap K = A$ is a closed set.




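Solution to Exercise 10.13. A point $p$ of a metric space $(X,d)$ is a limit point of a set
$E \subset X$ if and only if every open set containing $p$ contains infinitely many points of $E$.


Proof. Let $p$ be a limit point of $E$. Suppose an open set $U$ containing $p$ contains at most finitely many points $q_1, q_2, \dots, q_m$
of $E$ distinct from $p$. Choose $\epsilon > 0$ so that $B_\epsilon(p) \subset U$. Let $\delta = \min\{\epsilon, d(p, q_1), d(p, q_2), d(p, q_3), \dots, d(p, q_m)\}$.
Then $B_\delta(p)$ contains no points of $E$ distinct from $p$, contradicting the assumption that $p$ is
a limit point of $E$. Conversely, if every open set containing $p$ contains infinitely many points of $E$ then every open set containing $p$ contains a point of $E$ distinct from $p$, so $p$ is a limit point of $E$.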


Chapter 11 


Differentiation in Higher 
Dimensions 


Definition 78 


Let $f : V \to \mathbb{R}^m$, where $V$ is an open set in $\mathbb{R}^n$. Then the partial derivative of $f$ at
$x \in V$ with respect to the variable $x_i$ is
$$f_{x_i}(x) = \frac{\partial f}{\partial x_i}(x) = \lim_{h \to 0} \frac{f(x + h e_i) - f(x)}{h}.$$

Let $f = \langle f_1, f_2, \dots, f_m \rangle$, where each $f_i : V \to \mathbb{R}$. The partial derivative of $f_{i\,x_j}$
with respect to $x_p$ is denoted $f_{i\,x_j x_p} = \frac{\partial^2 f_i}{\partial x_p\, \partial x_j}$ for any natural numbers $p, j \le n$. In
a similar manner, we can inductively define $f_{i\,x_{j_1} x_{j_2} \dots x_{j_k}} = \frac{\partial^k f_i}{\partial x_{j_k}\, \partial x_{j_{k-1}} \cdots \partial x_{j_1}}$ to be
the $k$th partial derivative of $f_i$ which is the derivative of $f_{i\,x_{j_1} x_{j_2} \dots x_{j_{k-1}}}$ with respect
to $x_{j_k}$. We say that $f$ is $C^k$ if all $k$th partial derivatives of all components of $f$ are
continuous. We say that $f$ is $C^\infty$ if $f$ is $C^k$ for all natural numbers $k$.
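
The difference quotient in the definition of the partial derivative can be evaluated numerically for a small value of $h$. The following is a minimal sketch, not part of the theory: the helper name partial, the step size and the sample function $f(x,y) = x^2 y + \sin y$ are assumptions made for the illustration, and the result only approximates the limit.

```python
import math


def partial(f, x, i, h=1e-6):
    """Forward difference quotient (f(x + h e_i) - f(x)) / h.

    This approximates the partial derivative with respect to the i-th
    variable (0-based index); the step h is an arbitrary small choice.
    """
    shifted = list(x)
    shifted[i] += h
    return (f(shifted) - f(x)) / h


def f(v):
    x, y = v
    return x * x * y + math.sin(y)


point = [1.0, 2.0]
fx = partial(f, point, 0)  # exact value is 2xy = 4
fy = partial(f, point, 1)  # exact value is x^2 + cos(y) = 1 + cos(2)
```

Shrinking $h$ improves the approximation only up to the point where floating-point rounding dominates, which is one reason the definition is stated as a limit rather than a fixed quotient.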


We also define mixed partials on the function $f$ itself in the preceding definition in the
same way, defining $f_{x_{j_1} x_{j_2} \dots x_{j_k}} = \frac{\partial^k f}{\partial x_{j_k}\, \partial x_{j_{k-1}} \cdots \partial x_{j_1}}$.
Partial derivatives are different from total derivatives of multivariable functions. The
derivative of a function $f : \mathbb{R}^n \to \mathbb{R}^m$, denoted $Df(x)$, is also referred to as the differential
$df_x$ in some texts, and is a linear transformation, usually written as an $m$ by $n$ matrix.


Definition 79 


Let $f : V \to \mathbb{R}^m$, where $V$ is an open set in $\mathbb{R}^n$. The derivative
of $f$ at $x$ is the unique transformation matrix $Df(x)$ (or $df_x$) satisfying
$$\lim_{h \to 0} \frac{|f(x + h) - f(x) - Df(x)h|}{|h|} = 0$$
if such a transformation $Df(x)h$ exists. We
say that $f$ is differentiable at $p$ if $f$ has a derivative at $p$ and that $f$ is differentiable
if $f$ is differentiable at every point of the domain of $f$.

If $f : V \to \mathbb{R}$ and the partial derivatives of $f$ exist at a point $x \in V$, then the
gradient of $f$ at $x$ is the vector $\nabla f(x) = \langle f_{x_1}(x), f_{x_2}(x), \dots, f_{x_n}(x) \rangle$. So, if $z = f(x,y)$
then $\nabla f(a,b) = \langle f_x(a,b), f_y(a,b) \rangle$, whereas if $w = f(x,y,z)$ then $\nabla f(a,b,c) = \langle f_x(a,b,c), f_y(a,b,c), f_z(a,b,c) \rangle$.

If $A$ is an $m \times n$ matrix and $T(x) = Ax$ is a linear transformation from $\mathbb{R}^n$ into
$\mathbb{R}^m$ then we denote $|A| = |T|$, the operator norm of $T$.

We use the notation $\Delta_f(x) = \det Df(x)$ if $f : \mathbb{R}^n \to \mathbb{R}^n$ is differentiable. This is
referred to as the Jacobian of $f$ at $x$.
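
The limit in the definition of the derivative can be watched numerically: for a differentiable function the quotient $|f(p+h) - f(p) - Df(p)h| / |h|$ should shrink as $h$ shrinks. The sketch below is only an illustration under assumptions; the map $f(x,y) = (x^2 y,\ x + y^2)$, its hand-computed matrix Df, the point $p = (1,2)$ and the direction $h = (s,s)$ are all choices made for the example.

```python
import math


def f(v):
    x, y = v
    return [x * x * y, x + y * y]


def Df(v):
    # Matrix of first partial derivatives of the sample map, computed by hand.
    x, y = v
    return [[2 * x * y, x * x], [1.0, 2 * y]]


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def derivative_ratio(p, h):
    """|f(p + h) - f(p) - Df(p) h| / |h| for a non-zero vector h."""
    A = Df(p)
    fp = f(p)
    fph = f([pi + hi for pi, hi in zip(p, h)])
    Ah = [sum(a * hj for a, hj in zip(row, h)) for row in A]
    remainder = [u - v - w for u, v, w in zip(fph, fp, Ah)]
    return norm(remainder) / norm(h)


p = [1.0, 2.0]
ratios = [derivative_ratio(p, [s, s]) for s in (1e-1, 1e-2, 1e-3, 1e-4)]
```

For this polynomial map the remainder is of second order in $h$, so each ratio is roughly a constant multiple of $|h|$ and the list decreases toward zero.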


It turns out that if the derivative exists then it is $\left[\frac{\partial f_i}{\partial x_j}\right]_{m \times n}$. If the first partial derivatives
are continuous at $x$ then the derivative always exists (these things are shown below). Note
that we also sometimes refer to the derivative as the transformation itself rather than the
matrix generating the transformation by matrix multiplication. It is more convenient for
us to think of the derivative as the matrix most of the time, with the understanding that
the matrix is used to give the corresponding transformation.


Theorem 11.1. Let $f : U \to \mathbb{R}^m$, where $U$ is open in $\mathbb{R}^n$, $p \in U$ and $f(x) = \langle f_1(x), f_2(x), \dots, f_m(x) \rangle$,
where each component function $f_i : U \to \mathbb{R}$. Then $f$ is differentiable at $p$ if and only if each
function $f_i$ is differentiable at $p$.


Proof. Let $A$ be any $m \times n$ matrix with rows $A_1, A_2, \dots, A_m$ (listed from the top row to
the bottom row). Then by Theorem 10.17,
$$\lim_{h \to 0} \frac{f(p + h) - f(p) - Ah}{|h|} = 0 \quad \text{if and only if} \quad \lim_{h \to 0} \frac{f_i(p + h) - f_i(p) - A_i \cdot h}{|h|} = 0$$
for each $1 \le i \le m$, which means that there is a matrix
$A = Df(p)$ so that $\lim_{h \to 0} \frac{f(p + h) - f(p) - Ah}{|h|} = 0$ if and only if, for each $1 \le i \le m$, there
is a row vector (a $1 \times n$ matrix) $A_i$ so that $\lim_{h \to 0} \frac{f_i(p + h) - f_i(p) - A_i \cdot h}{|h|} = 0$. The result
follows.


Theorem 11.2. Let $f : U \to \mathbb{R}^m$, where $U$ is open in $\mathbb{R}^n$, $p \in U$ and let $A$ be an $m \times n$
matrix. Then $f$ is differentiable at $p$ with derivative $A = Df(p)$ if and only if there is a
function $\epsilon(h) : U \to \mathbb{R}^m$ so that $f(p + h) - f(p) = Ah + \epsilon(h)$ on $U$, where $\lim_{h \to 0} \frac{\epsilon(h)}{|h|} = 0$.

Proof. First, let $f$ be differentiable at $p$ with $A = Df(p)$. Let $\epsilon(h) = f(p + h) - f(p) - Df(p)h$. By definition of derivative, we know that
$$\lim_{h \to 0} \frac{\epsilon(h)}{|h|} = \lim_{h \to 0} \frac{f(p + h) - f(p) - Df(p)h}{|h|} = 0.$$

Next, assume there is a function $\epsilon(h) : U \to \mathbb{R}^m$ so that $f(p + h) - f(p) = Ah + \epsilon(h)$ on $U$ for some matrix $A$, where $\lim_{h \to 0} \frac{\epsilon(h)}{|h|} = 0$. Then that means
$$\lim_{h \to 0} \frac{\epsilon(h)}{|h|} = \lim_{h \to 0} \frac{f(p + h) - f(p) - Ah}{|h|} = 0,$$
so $f$ is differentiable at $p$ and $A = Df(p)$.
— 




Theorem 11.3. Let $f : V \to \mathbb{R}^m$, where $V$ is an open subset of $\mathbb{R}^n$ and $p \in V$ and $f$ is
differentiable at $p$. Then $f$ is continuous at $p$.

Proof. By Theorem 11.2, there is a function $\epsilon(h) : V \to \mathbb{R}^m$ so that $f(p + h) - f(p) = Df(p)h + \epsilon(h)$ on $V$, where $\lim_{h \to 0} \frac{\epsilon(h)}{|h|} = 0$. Hence, $\lim_{h \to 0} \left( f(p + h) - f(p) \right) = \lim_{h \to 0} \left( Df(p)h + \epsilon(h) \right) = 0$, which means that $\lim_{h \to 0} f(p + h) = f(p)$ and thus $\lim_{x \to p} f(x) = f(p)$, so $f$ is
continuous at $p$.


Theorem 11.4. Let $f : V \to \mathbb{R}$, where $V$ is an open subset of $\mathbb{R}^n$ and $x \in V$ and $f$ is
differentiable at $x$. Then $Df(x) = \nabla f(x)$.


Proof. We know that $\lim_{h \to 0} \frac{f(x + h) - f(x) - Df(x) \cdot h}{|h|} = 0$. Thus, in particular, for any
given $1 \le i \le n$ we know that if $h = h e_i$ then
$$\lim_{h \to 0} \frac{f(x + h e_i) - f(x) - Df(x) \cdot h e_i}{h} = 0.$$
Hence, if $Df(x)_i$ is the $i$th coordinate of $Df(x)$ then $\lim_{h \to 0} \frac{Df(x) \cdot h e_i}{h} = Df(x)_i$
and $\lim_{h \to 0} \frac{f(x + h e_i) - f(x)}{h} = Df(x)_i$. Thus, by definition, $f_{x_i}(x)$ exists and is equal to
$Df(x)_i$ for each $1 \le i \le n$, so $Df(x) = \nabla f(x)$.


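Theorem 11.5. Let $f : V \to \mathbb{R}^m$ be differentiable at $x$, where $V$ is an open subset of $\mathbb{R}^n$
and $x \in V$. Then $Df(x) = \left[\frac{\partial f_i}{\partial x_j}\right]_{m \times n}$.


Proof. We know that $\lim_{h \to 0} \frac{|f(x + h) - f(x) - Df(x)h|}{|h|} = 0$, which is true if and only if
$\lim_{h \to 0} \frac{f_i(x + h) - f_i(x) - Df_i(x) \cdot h}{|h|} = 0$ for each $i$, where $Df_i(x)$ represents the $i$th row of
$Df(x)$. By theorem 11.4, we know that $Df_i(x) = \nabla f_i(x)$, which means that $Df(x) = \left[\frac{\partial f_i}{\partial x_j}\right]_{m \times n}$.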
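
Since the rows of $Df(x)$ are the gradients of the component functions, the matrix can be approximated entry by entry with difference quotients. The following sketch assumes a sample map $f(x,y,z) = (xy,\ y + z^2)$ from $\mathbb{R}^3$ into $\mathbb{R}^2$; the helper name jacobian and the step size are hypothetical choices, and the output only approximates the exact matrix.

```python
def jacobian(f, x, h=1e-6):
    """Approximate the m x n matrix of partials df_i/dx_j by forward differences."""
    fx = f(x)
    columns = []
    for j in range(len(x)):
        shifted = list(x)
        shifted[j] += h
        fj = f(shifted)
        columns.append([(a - b) / h for a, b in zip(fj, fx)])
    # columns[j][i] holds df_i/dx_j; transpose so that row i belongs to f_i.
    return [[columns[j][i] for j in range(len(x))] for i in range(len(fx))]


def f(v):
    x, y, z = v
    return [x * y, y + z * z]


J = jacobian(f, [1.0, 2.0, 3.0])
# Exact matrix at (1, 2, 3): row 1 is (y, x, 0), row 2 is (0, 1, 2z).
exact = [[2.0, 1.0, 0.0], [0.0, 1.0, 6.0]]
```

The result has $m = 2$ rows and $n = 3$ columns, matching the $m \times n$ shape in the theorem.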


Theorem 11.6. Let $f : V \to \mathbb{R}$, where $V$ is an open subset of $\mathbb{R}^n$ and $x \in V$ so that the
partial derivatives of $f$ exist on $V$ and are continuous at $x$.

(a) Then $f$ is differentiable at $x$.

(b) If the partials of $f$ are continuous on $V$ and $K$ is a compact subset of $V$ then for every
$\epsilon > 0$ there is a $\delta > 0$ so that if $x \in K$ and $|h| < \delta$ then $f(x + h) = f(x) + \nabla f(x) \cdot h + R(h)$,
where $|R(h)| \le \epsilon |h|$.


Proof. We know that if the derivative exists then it equals the gradient. We will show that
the gradient satisfies the definition of derivative. Choose $r > 0$ so that $B_r(x) \subset V$ and let $h$
be a vector so that $|h| < r$, where $h = \langle h_1, h_2, \dots, h_n \rangle$ and $x = \langle x_1, x_2, \dots, x_n \rangle$. Note that
$$f(x + h) - f(x) = \big(f(x_1 + h_1, \dots, x_n + h_n) - f(x_1 + h_1, \dots, x_{n-1} + h_{n-1}, x_n)\big)$$
$$+ \big(f(x_1 + h_1, \dots, x_{n-1} + h_{n-1}, x_n) - f(x_1 + h_1, \dots, x_{n-2} + h_{n-2}, x_{n-1}, x_n)\big) + \dots$$
$$+ \big(f(x_1 + h_1, x_2, \dots, x_n) - f(x_1, x_2, \dots, x_n)\big).$$
For each $1 \le i \le n$ we define $g_i(t) = f(x_1 + h_1, \dots, x_{i-1} + h_{i-1}, x_i + t, x_{i+1}, \dots, x_n)$. Note that
$$g_i'(t) = \lim_{h \to 0} \frac{f(x_1 + h_1, \dots, x_i + t + h, x_{i+1}, \dots, x_n) - f(x_1 + h_1, \dots, x_i + t, x_{i+1}, \dots, x_n)}{h} = f_{x_i}(x_1 + h_1, \dots, x_i + t, x_{i+1}, \dots, x_n).$$
Since $g_i$ is a function of one variable, by the Mean Value Theorem
we can find, for each such integer $i$, a point $c_i$ between $0$ and $h_i$ so that
$$g_i'(c_i)(h_i - 0) = g_i(h_i) - g_i(0) = f(x_1 + h_1, \dots, x_i + h_i, x_{i+1}, \dots, x_n) - f(x_1 + h_1, \dots, x_{i-1} + h_{i-1}, x_i, x_{i+1}, \dots, x_n),$$
unless $h_i = 0$ in which case the Mean Value Theorem does not apply, and we set $c_i = 0$
and the equation $g_i'(c_i)(h_i - 0) = g_i(h_i) - g_i(0)$ reduces to $0 = 0$, which is still true.

Thus, it follows that
$$f(x + h) - f(x) = \sum_{i=1}^{n} g_i'(c_i) h_i = \sum_{i=1}^{n} f_{x_i}(x_1 + h_1, \dots, x_{i-1} + h_{i-1}, x_i + c_i, x_{i+1}, \dots, x_n) h_i = \delta(h) \cdot h,$$
where $\delta(h)$ is the vector whose $i$th coordinate is $f_{x_i}(x_1 + h_1, \dots, x_{i-1} + h_{i-1}, x_i + c_i, x_{i+1}, \dots, x_n)$. Thus,
$$f(x + h) - f(x) - \nabla f(x) \cdot h = \sum_{i=1}^{n} \big(f_{x_i}(x_1 + h_1, \dots, x_i + c_i, x_{i+1}, \dots, x_n) - f_{x_i}(x_1, x_2, \dots, x_n)\big) h_i = h \cdot (\delta(h) - \nabla f(x)).$$
But since the $c_i$ are between $0$ and
$h_i$ we know that each $c_i$ approaches zero and each corresponding $x_i + c_i$ approaches $x_i$ as $h$
approaches zero. Since each $f_{x_i}$ is continuous at $x$, it follows that $\lim_{h \to 0} \left( \delta(h) - \nabla f(x) \right) = 0$. By
the Cauchy Schwarz inequality we know that
$$\frac{|h \cdot (\delta(h) - \nabla f(x))|}{|h|} \le \frac{|h|\,|\delta(h) - \nabla f(x)|}{|h|} = |\delta(h) - \nabla f(x)|,$$
so $\lim_{h \to 0} \frac{|f(x + h) - f(x) - \nabla f(x) \cdot h|}{|h|} = 0$ by the Squeeze Theorem.

(b) For each $x \in K$ choose an $\epsilon_x > 0$ so that $\overline{B_{\epsilon_x}(x)} \subset V$. Since $K$ is compact and
$\{B_{\epsilon_x}(x)\}_{x \in K}$ is an open cover of $K$, we can find a finite subcover $F = \{B_{\epsilon_{x_i}}(x_i)\}_{1 \le i \le k}$.
By the Lebesgue Number Lemma we can find a $\gamma > 0$ so that if $S$ is a set in $\mathbb{R}^n$ whose
diameter does not exceed $\gamma$ and $S \cap K \ne \emptyset$ then $S \subset B_{\epsilon_{x_i}}(x_i)$ for some $i$. Note also that
$H = \bigcup_{i=1}^{k} \overline{B_{\epsilon_{x_i}}(x_i)}$ is a compact subset of $V$.

For each $i \in \{1, 2, \dots, n\}$, the partials $f_{x_i}$ are uniformly continuous on $H$ by Theorem
10.35, so we can choose $\delta_i > 0$ so that if $|x - y| < \delta_i$ and $x, y \in H$ then $|f_{x_i}(x) - f_{x_i}(y)| < \frac{\epsilon}{\sqrt{n}}$. Let $\delta = \min\{\gamma, \delta_1, \delta_2, \dots, \delta_n\}$.
We note then, from the proof of (a), that $|\delta(h)_i - f_{x_i}(x)| < \frac{\epsilon}{\sqrt{n}}$ for each $i \in \{1, 2, \dots, n\}$ if
$|h| < \delta$. As in part (a) we see that, for $0 < |h| < \delta$,
$$\frac{|f(x + h) - f(x) - \nabla f(x) \cdot h|}{|h|} = \frac{|h \cdot (\delta(h) - \nabla f(x))|}{|h|} \le |\delta(h) - \nabla f(x)| = \sqrt{\sum_{i=1}^{n} (\delta(h)_i - f_{x_i}(x))^2} < \sqrt{n} \cdot \frac{\epsilon}{\sqrt{n}} = \epsilon.$$
Since $\frac{|f(x + h) - f(x) - \nabla f(x) \cdot h|}{|h|} < \epsilon$, if we set $R(h) = f(x + h) - f(x) - \nabla f(x) \cdot h$
then we see that $|R(h)| = |f(x + h) - f(x) - \nabla f(x) \cdot h| < \epsilon |h|$ for $h \ne 0$ (and $R(0) = 0$), and $f(x + h) = f(x) + \nabla f(x) \cdot h + R(h)$.




We next extend the previous result to functions from open sets in R” into R™ (instead 
of just R). 


Theorem 11.7. Let $f : V \to \mathbb{R}^m$, where $V$ is an open subset of $\mathbb{R}^n$ and $x \in V$ so that the
partial derivatives of $f$ exist on $V$ and are continuous at $x$. Then:

(a) $f$ is differentiable at $x$.

(b) If the partials of $f$ are continuous on $V$ and $K$ is a compact subset of $V$ then for every
$\epsilon > 0$ there is a $\delta > 0$ so that if $x \in K$ and $|h| < \delta$ then $f(x + h) = f(x) + Df(x)h + R(h)$,
where $|R(h)| \le \epsilon |h|$.


Proof. (a) By theorem 11.6, we know that for each $1 \le i \le m$ it is the case that $f_i$ is
differentiable at $x$, which means that $f$ is differentiable at $x$ by Theorem 11.1.
(b) By 11.6 we know that for each $i \in \{1, 2, 3, \dots, m\}$ we can find $\delta_i > 0$ so that if $x \in K$
and $0 < |h| < \delta_i$ then $f_i(x + h) = f_i(x) + \nabla f_i(x) \cdot h + R_i(h)$, where $|R_i(h)| < \frac{\epsilon |h|}{\sqrt{m}}$, so that
$\frac{|f_i(x + h) - f_i(x) - \nabla f_i(x) \cdot h|}{|h|} < \frac{\epsilon}{\sqrt{m}}$ for each $i \in \{1, 2, 3, \dots, m\}$.

Let $\delta = \min\{\delta_1, \delta_2, \dots, \delta_m\}$. If $0 < |h| < \delta$ it follows that
$$\frac{|f(x + h) - f(x) - Df(x)h|}{|h|} = \sqrt{\sum_{i=1}^{m} \frac{(f_i(x + h) - f_i(x) - \nabla f_i(x) \cdot h)^2}{|h|^2}} < \epsilon.$$

Set $R(h) = f(x + h) - f(x) - Df(x)h$. Then $|R(h)| = |f(x + h) - f(x) - Df(x)h| < \epsilon |h|$ for $h \ne 0$ (and $R(0) = 0$),
and $f(x + h) = f(x) + Df(x)h + R(h)$.


Theorem 11.8. Clairaut's Theorem. Let $f : V \to \mathbb{R}$ be $C^2$, where $V$ is an open set in $\mathbb{R}^2$.
Then $f_{xy} = f_{yx}$ on $V$.


Proof. Let $a = (x_0, y_0) \in V$. Since $V$ is open we can find an $\epsilon > 0$ so that $B_\epsilon(a) \subset V$.
Then for non-zero $h, k$ so that $|h|, |k| < \frac{\epsilon}{2}$ it follows that $(x_0 + h, y_0 + k) \in V$ and we can define
$d(h,k) = f(x_0 + h, y_0 + k) - f(x_0, y_0 + k) - f(x_0 + h, y_0) + f(x_0, y_0)$. Define $g_1(t) = f(x_0 + th, y_0 + k) - f(x_0 + th, y_0)$. Then $g_1$ is differentiable and $g_1(1) - g_1(0) = d(h,k)$ and by
the Mean Value Theorem there is a point $c_h \in (0,1)$ so that $g_1'(c_h)(1 - 0) = g_1(1) - g_1(0) = d(h,k)$. But by the chain rule, $g_1'(c_h) = h(f_x(x_0 + c_h h, y_0 + k) - f_x(x_0 + c_h h, y_0))$, so
$\frac{d(h,k)}{h} = f_x(x_0 + c_h h, y_0 + k) - f_x(x_0 + c_h h, y_0)$. Next define $g_2(t) = f_x(x_0 + c_h h, y_0 + tk)$
and note that $g_2(1) - g_2(0) = f_x(x_0 + c_h h, y_0 + k) - f_x(x_0 + c_h h, y_0)$, so by the Mean
Value Theorem there is a point $c_k \in (0,1)$ so that $g_2'(c_k)(1 - 0) = g_2(1) - g_2(0) = \frac{d(h,k)}{h}$.
By the chain rule, we know that $g_2'(c_k) = k f_{xy}(x_0 + c_h h, y_0 + c_k k)$, which means that
$f_{xy}(x_0 + c_h h, y_0 + c_k k) = \frac{d(h,k)}{hk}$. Since $f$ is $C^2$ on $V$ we know that
$$\lim_{(h,k) \to (0,0)} f_{xy}(x_0 + c_h h, y_0 + c_k k) = f_{xy}(x_0, y_0) = \lim_{(h,k) \to (0,0)} \frac{d(h,k)}{hk}.$$

We then define $g_3(t) = f(x_0 + h, y_0 + tk) - f(x_0, y_0 + tk)$ and note that $g_3(1) - g_3(0) = d(h,k)$ and by the Mean Value Theorem there is a point $q_k \in (0,1)$ so that $g_3'(q_k)(1 - 0) = g_3(1) - g_3(0) = d(h,k)$. But by the chain rule, $g_3'(q_k) = k(f_y(x_0 + h, y_0 + q_k k) - f_y(x_0, y_0 + q_k k))$, so $\frac{d(h,k)}{k} = f_y(x_0 + h, y_0 + q_k k) - f_y(x_0, y_0 + q_k k)$. Finally, let $g_4(t) = f_y(x_0 + th, y_0 + q_k k)$ and note that $g_4(1) - g_4(0) = f_y(x_0 + h, y_0 + q_k k) - f_y(x_0, y_0 + q_k k)$, so by the Mean
Value Theorem there is a point $q_h \in (0,1)$ so that $g_4'(q_h)(1 - 0) = g_4(1) - g_4(0) = \frac{d(h,k)}{k}$.
By the chain rule, we know that $g_4'(q_h) = h f_{yx}(x_0 + q_h h, y_0 + q_k k)$, which means that
$f_{yx}(x_0 + q_h h, y_0 + q_k k) = \frac{d(h,k)}{hk}$. Since $f$ is $C^2$ on $V$ we know that
$$\lim_{(h,k) \to (0,0)} f_{yx}(x_0 + q_h h, y_0 + q_k k) = f_{yx}(x_0, y_0) = \lim_{(h,k) \to (0,0)} \frac{d(h,k)}{hk} = f_{xy}(x_0, y_0).$$
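
The quantity $d(h,k)/(hk)$ at the heart of this proof is symmetric in the roles of $x$ and $y$, and it can be computed directly. The sketch below is an illustration under assumptions: the sample function $f(x,y) = x^2 y^3$, the base point $(1,2)$ and the helper name second_difference are choices made for the example, with $f_{xy} = f_{yx} = 6xy^2$ worked out by hand.

```python
def second_difference(f, x0, y0, h, k):
    """d(h, k) = f(x0+h, y0+k) - f(x0, y0+k) - f(x0+h, y0) + f(x0, y0)."""
    return f(x0 + h, y0 + k) - f(x0, y0 + k) - f(x0 + h, y0) + f(x0, y0)


def f(x, y):
    return x * x * y ** 3


x0, y0 = 1.0, 2.0
exact_mixed = 6 * x0 * y0 ** 2  # both mixed partials equal 6xy^2 for this f
quotients = [second_difference(f, x0, y0, s, s) / (s * s) for s in (1e-1, 1e-2, 1e-3)]
errors = [abs(q - exact_mixed) for q in quotients]
```

As $(h,k)$ shrinks the quotient approaches the common value of the two mixed partials, which is the numerical shadow of the two limits computed in the proof.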


Since partial derivatives are calculated with other variables fixed, we can also extend 
Clairaut’s Theorem to switching the order of any pair of variables for a multivariable mixed 
partial. This is addressed in the exercises. 


It is sometimes useful to use the following alternate definition of differentiability, which 
we show is equivalent to the one we gave. Either can be considered the definition of 
differentiable for a function from R” into R. 


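Theorem 11.9. Let $f : V \to \mathbb{R}$ where $V$ is open in $\mathbb{R}^n$. Then $f$ is differentiable at $x \in V$
if and only if there are functions $\epsilon_i(h)$ for $1 \le i \le n$ so that $f(x + h) - f(x) - \nabla f(x) \cdot h = \sum_{i=1}^{n} \epsilon_i(h) h_i$ where $\lim_{h \to 0} \epsilon_i(h) = 0$ for each $i$.


Proof. First, we assume that $f$ is differentiable at $x$. Then
$\lim_{h \to 0} \frac{f(x + h) - f(x) - \nabla f(x) \cdot h}{|h|} = 0$. We then define, for $h \ne 0$,
$$\epsilon_i(h) = \frac{\big(f(x + h) - f(x) - \nabla f(x) \cdot h\big)\,\operatorname{sgn}(h_i)}{\sum_{j=1}^{n} |h_j|},$$
where $\operatorname{sgn}(h_i) = \frac{h_i}{|h_i|}$ if $h_i \ne 0$ and $\operatorname{sgn}(h_i) = 0$ if $h_i = 0$. Then
$$\sum_{i=1}^{n} \epsilon_i(h) h_i = \sum_{i=1}^{n} \frac{\big(f(x + h) - f(x) - \nabla f(x) \cdot h\big)\,|h_i|}{\sum_{j=1}^{n} |h_j|} = f(x + h) - f(x) - \nabla f(x) \cdot h.$$
Since $\sum_{i=1}^{n} |h_i| \ge |h|$ and $\lim_{h \to 0} \frac{|f(x + h) - f(x) - \nabla f(x) \cdot h|}{|h|} = 0$, it follows that
$\lim_{h \to 0} |\epsilon_i(h)| \le \lim_{h \to 0} \frac{|f(x + h) - f(x) - \nabla f(x) \cdot h|}{|h|} = 0$, so $\lim_{h \to 0} \epsilon_i(h) = 0$.

Next, assume that there are functions $\epsilon_i(h)$ for $1 \le i \le n$ so that $f(x + h) - f(x) - \nabla f(x) \cdot h = \sum_{i=1}^{n} \epsilon_i(h) h_i$ where $\lim_{h \to 0} \epsilon_i(h) = 0$ for each $i$.
Then
$$\lim_{h \to 0} \frac{f(x + h) - f(x) - \nabla f(x) \cdot h}{|h|} = \lim_{h \to 0} \sum_{i=1}^{n} \frac{\epsilon_i(h) h_i}{|h|} = 0$$
since $\frac{|h_i|}{|h|} \le 1$ and $\lim_{h \to 0} \epsilon_i(h) = 0$ for each $i$. Thus, $f$ is differentiable.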




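Sometimes it looks nicer to represent the coordinates of $h$ in the preceding theorem
as changes in the variables. So, in two coordinates you might represent this vector as
$h = (\Delta x, \Delta y)$ for instance. Using this format, for the specific case of $n = 2$ and $n = 3$
theorem 11.9 can be stated as follows.


Let $f : V \to \mathbb{R}$ where $V$ is open in $\mathbb{R}^2$. Then $f$ is differentiable at $(x,y) \in V$ if and
only if there are functions $\epsilon_1(\Delta x, \Delta y)$ and $\epsilon_2(\Delta x, \Delta y)$ so that
$$f(x + \Delta x, y + \Delta y) - f(x,y) - f_x(x,y)\Delta x - f_y(x,y)\Delta y = \epsilon_1(\Delta x, \Delta y)\Delta x + \epsilon_2(\Delta x, \Delta y)\Delta y$$
where $\lim_{(\Delta x, \Delta y) \to (0,0)} \epsilon_1(\Delta x, \Delta y) = 0 = \lim_{(\Delta x, \Delta y) \to (0,0)} \epsilon_2(\Delta x, \Delta y)$.

Let $f : V \to \mathbb{R}$ where $V$ is open in $\mathbb{R}^3$. Then $f$ is differentiable at $(x,y,z) \in V$ if and
only if there are functions $\epsilon_1(\Delta x, \Delta y, \Delta z)$, $\epsilon_2(\Delta x, \Delta y, \Delta z)$ and $\epsilon_3(\Delta x, \Delta y, \Delta z)$ so that
$$f(x + \Delta x, y + \Delta y, z + \Delta z) - f(x,y,z) - f_x(x,y,z)\Delta x - f_y(x,y,z)\Delta y - f_z(x,y,z)\Delta z$$
$$= \epsilon_1(\Delta x, \Delta y, \Delta z)\Delta x + \epsilon_2(\Delta x, \Delta y, \Delta z)\Delta y + \epsilon_3(\Delta x, \Delta y, \Delta z)\Delta z$$
where $\lim_{(\Delta x, \Delta y, \Delta z) \to (0,0,0)} \epsilon_1 = \lim_{(\Delta x, \Delta y, \Delta z) \to (0,0,0)} \epsilon_2 = \lim_{(\Delta x, \Delta y, \Delta z) \to (0,0,0)} \epsilon_3 = 0$.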


Theorem 11.10. Chain Rule for Euclidean Spaces. Let $f : U \to \mathbb{R}^m$ be differentiable at $p$
and let $g : V \to \mathbb{R}^j$ be differentiable at $f(p)$, where $U$ is open in $\mathbb{R}^n$ and $V$ is open in $\mathbb{R}^m$.
Then $g \circ f$ is differentiable at $p$ and $D(g \circ f)(p) = Dg(f(p))Df(p)$.


Proof. First, there is an $R > 0$ so that $B_R(f(p)) \subset V$, and we can choose $r > 0$ so that if $|p - x| < r$ then $|f(x) - f(p)| < R$ (since $f$ is continuous at $p$), and $B_r(p) \subset U$, which means that $B_r(p)$ is contained in the domain of $g \circ f$. For the remainder of the argument, assume that $0 < |h| < r$ and $h \in \mathbb{R}^n$.

We also observe that $Dg(f(p))Df(p)$ is a $j \times n$ matrix, which is the correct matrix size for the derivative.

By Theorem 11.2, we can find $\epsilon(h) : U \to \mathbb{R}^m$ so that $f(p+h) - f(p) = Df(p)h + \epsilon(h)$ on $U$, where $\lim_{h \to 0} \frac{\epsilon(h)}{|h|} = 0$. Likewise, we can find $\delta(k) : V \to \mathbb{R}^j$ so that $g(f(p) + k) - g(f(p)) = Dg(f(p))k + \delta(k)$ and $\lim_{k \to 0} \frac{\delta(k)}{|k|} = 0$.

Note $g(f(p+h)) = g\big(f(p) + (f(p+h) - f(p))\big)$.

Thus, using $k = f(p+h) - f(p)$, we can rewrite $g(f(p+h)) - g(f(p)) = Dg(f(p))(f(p+h) - f(p)) + \delta(f(p+h) - f(p))$.

Hence, replacing $f(p+h) - f(p)$ with $Df(p)h + \epsilon(h)$, we see that $g(f(p+h)) - g(f(p)) - Dg(f(p))Df(p)h$ can be written as
$$Dg(f(p))(Df(p)h + \epsilon(h)) + \delta(f(p+h) - f(p)) - Dg(f(p))Df(p)h = Dg(f(p))(\epsilon(h)) + \delta(f(p+h) - f(p)).$$
Thus
$$\frac{|g(f(p+h)) - g(f(p)) - Dg(f(p))Df(p)h|}{|h|} \leq \frac{|Dg(f(p))(\epsilon(h))|}{|h|} + \frac{|\delta(f(p+h) - f(p))|}{|h|}, \quad \text{where} \quad \frac{|Dg(f(p))(\epsilon(h))|}{|h|} \leq \frac{|Dg(f(p))||\epsilon(h)|}{|h|}.$$

Let $\gamma > 0$. Since $\delta(0) = 0$, it is true that when $f(p+h) - f(p) = 0$, we know that $\frac{|\delta(f(p+h) - f(p))|}{|h|} = 0 < \frac{\gamma}{2}$. If $f(p+h) - f(p) \neq 0$ then
$$\frac{|\delta(f(p+h) - f(p))|}{|f(p+h) - f(p)|} \cdot \frac{|f(p+h) - f(p)|}{|h|} = \frac{|\delta(f(p+h) - f(p))|}{|h|}.$$
Since we know that $\lim_{h \to 0} \frac{\epsilon(h)}{|h|} = 0$, we can find $t \in (0, r)$ so that if $0 < |h| < t$ then $\frac{|Dg(f(p))||\epsilon(h)|}{|h|} < \frac{\gamma}{2}$.

Next observe that $|f(p+h) - f(p)| = |Df(p)h + \epsilon(h)| \leq |Df(p)||h| + |\epsilon(h)|$. Since $\lim_{h \to 0} \frac{|\epsilon(h)|}{|h|} = 0$ we can find $s \in (0, t)$ so that if $0 < |h| < s$ then $\frac{|\epsilon(h)|}{|h|} < 1$, which means that $\frac{|f(p+h) - f(p)|}{|h|} < |Df(p)| + 1$.

Choose $0 < \beta < R$ so that if $0 < |k| < \beta$ then $\frac{|\delta(k)|}{|k|} < \frac{\gamma}{2(|Df(p)| + 1)}$. Choose $0 < a < s$ so that if $0 < |h| < a$ then $|f(p+h) - f(p)| < \beta$, so it follows that
$$\frac{|g(f(p+h)) - g(f(p)) - Dg(f(p))Df(p)h|}{|h|} \leq \frac{|Dg(f(p))(\epsilon(h))|}{|h|} + \frac{|\delta(f(p+h) - f(p))|}{|h|} < \frac{\gamma}{2} + \frac{\gamma}{2(|Df(p)| + 1)}(|Df(p)| + 1) = \gamma.$$
We conclude that $\lim_{h \to 0} \frac{|g(f(p+h)) - g(f(p)) - Dg(f(p))Df(p)h|}{|h|} = 0$, so $D(g \circ f)(p) = Dg(f(p))Df(p)$ as desired.
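As a numerical illustration of the Chain Rule (not part of the proof), the following sketch compares a finite-difference Jacobian of a composition with the product of the two finite-difference Jacobians. The maps `f` and `g`, the point `p`, and the helper names `jacobian` and `matmul` are all assumptions chosen for this example.

```python
import math

def f(x, y):
    # Illustrative inner map f : R^2 -> R^2 (an assumption, not from the text).
    return (x * y, x + math.sin(y))

def g(u, v):
    # Illustrative outer map g : R^2 -> R^1.
    return (u * u + math.exp(v),)

def jacobian(func, point, step=1e-6):
    """Approximate the Jacobian matrix of func at point by central differences."""
    rows = len(func(*point))
    cols = len(point)
    jac = [[0.0] * cols for _ in range(rows)]
    for j in range(cols):
        plus, minus = list(point), list(point)
        plus[j] += step
        minus[j] -= step
        f_plus, f_minus = func(*plus), func(*minus)
        for i in range(rows):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2 * step)
    return jac

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def composite(x, y):
    return g(*f(x, y))

p = (0.7, -0.3)
lhs = jacobian(composite, p)                      # D(g o f)(p), a 1 x 2 matrix
rhs = matmul(jacobian(g, f(*p)), jacobian(f, p))  # Dg(f(p)) Df(p)
chain_rule_gap = max(abs(lhs[0][j] - rhs[0][j]) for j in range(2))
print(chain_rule_gap)
```

The two matrices agree up to finite-difference error, and the shapes match the observation in the proof that $Dg(f(p))Df(p)$ is a $j \times n$ matrix (here $1 \times 2$).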


One helpful consequence of the Chain Rule is a way of simplifying implicit differentiation. 
Definition 80 


If $x \in \mathbb{R}^s$ and $y \in \mathbb{R}^t$ we use the notation $(x, y) = (x_1, x_2, ..., x_s, y_1, y_2, ..., y_t) \in \mathbb{R}^{s+t}$. The graph of $F(x) = k$ is the set of solutions of the equation $F(x) = k$. The graph of a function $z = f(x)$ over a domain $D \subset \mathbb{R}^n$ is $\{(x, z) \in \mathbb{R}^{n+1} \mid z = f(x)$ and $x \in D\}$. We say a set $S \subset \mathbb{R}^{n+1}$ is locally the graph of a function near (or at) a point $p$ if there is an $\epsilon > 0$ so that $B_\epsilon(p) \cap S$ is the graph of a function of $n$ variables, meaning that it is the graph of a function $x_j = f(x_1, x_2, ..., x_{j-1}, x_{j+1}, ..., x_{n+1})$ for some $1 \leq j \leq n+1$, in which case we say that $S$ (or an equation whose solutions are $S$) defines $x_j$ as a function of $x_1, ..., x_{j-1}, x_{j+1}, ..., x_{n+1}$ near (or at) the point $x_0 = p$, or on the ball of radius $\epsilon$ about $x_0$. We would say that $S$ is locally the graph of a function if it is locally the graph of a function at every point of $S$. Similarly, we would say that $S$ is locally the graph of a differentiable or $C^k$ function at $x_0$ if the function $f$ is differentiable or $C^k$.

We say a vector $n \in \mathbb{R}^{n+1}$ is normal to $S$ (or orthogonal to $S$ or perpendicular to $S$) at a point $p \in S$ which is a limit point of $S$ if for every sequence $\{x_i\} \subset S \setminus \{p\}$ so that $\{x_i\} \to p$ it is the case that $\left\{\frac{x_i - p}{|x_i - p|} \cdot n\right\} \to 0$. If $n$ is normal to $S$ at $p$ then we say that $P = \{x \in \mathbb{R}^{n+1} \mid n \cdot (x - p) = 0\}$ is the tangent hyperplane to $S$ at $p$. If $n = 2$ then we refer to $P$ as the tangent plane to $S$ at $p$.




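Theorem 11.11. Let $S$ be the graph of $z = f(x_1, x_2, ..., x_n)$ for a function $f : D \to \mathbb{R}$ on some domain $D \subset \mathbb{R}^n$ containing the ball $B_\epsilon(c)$ so that $f$ is differentiable at $c$. Then the vector $n = \langle f_{x_1}(c), f_{x_2}(c), ..., f_{x_n}(c), -1 \rangle$ is normal to $S$ at $(c, f(c)) = p$ and has corresponding tangent hyperplane $P = \{x \in \mathbb{R}^{n+1} \mid n \cdot (x - p) = 0\}$.


Proof. Let $\{x_i\}$ be a sequence of points in $S \setminus \{p\}$ which converges to $p = (c, f(c))$. Then for each natural number $i$ we can find $h_i \neq 0$ so that $x_i = (c + h_i, f(c + h_i))$. Since $f$ is differentiable at $c$ we know that
$$\lim_{h \to 0} \frac{f(c+h) - f(c) - \nabla f(c) \cdot h}{|h|} = 0,$$
so by the sequential characterization of limits we know that $\left\{\frac{f(c + h_i) - f(c) - \nabla f(c) \cdot h_i}{|h_i|}\right\} \to 0$. Since $x_i = (c + h_i, f(c + h_i))$ it follows that $x_i - p = (h_i, f(c + h_i) - f(c))$, so
$$\left\{\frac{x_i - p}{|x_i - p|} \cdot n\right\} = \left\{\frac{\nabla f(c) \cdot h_i - (f(c + h_i) - f(c))}{|x_i - p|}\right\}.$$
Because $|x_i - p| \geq |h_i|$, the absolute value of each term is at most $\frac{|f(c + h_i) - f(c) - \nabla f(c) \cdot h_i|}{|h_i|}$, so this sequence converges to $0$. The result follows.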
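The definition of a normal vector can be checked numerically on a simple surface. The sketch below uses the illustrative function $f(x, y) = x^2 + 3y$ at $c = (1, 2)$ (both chosen for this example, not taken from the text) and evaluates $\frac{x_i - p}{|x_i - p|} \cdot n$ along a sequence of surface points approaching $p$.

```python
import math

def f(x, y):
    # Illustrative surface z = f(x, y); chosen for this sketch only.
    return x * x + 3.0 * y

c = (1.0, 2.0)
p = (c[0], c[1], f(*c))
normal = (2.0 * c[0], 3.0, -1.0)  # (f_x(c), f_y(c), -1) as in Theorem 11.11

def normalized_dot(h):
    """Return ((x - p) / |x - p|) . n for the surface point x above c + h."""
    x = (c[0] + h[0], c[1] + h[1], f(c[0] + h[0], c[1] + h[1]))
    diff = [x[k] - p[k] for k in range(3)]
    length = math.sqrt(sum(d * d for d in diff))
    return sum(d * n for d, n in zip(diff, normal)) / length

# A sequence of displacements h_i -> 0; the dot products should tend to 0.
values = [normalized_dot((1.0 / i, -2.0 / i)) for i in (10, 100, 1000, 10000)]
print(values)
```

The printed values shrink toward zero as the displacement shrinks, which is the behavior the definition of a normal vector requires.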


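Theorem 11.12. Let $S$ be the graph of $F(x) = k$, where $x = (x_1, x_2, x_3, ..., x_m) \in \mathbb{R}^m$, $F$ is differentiable, and $S$ is locally the graph of a differentiable function at $x_0 \in S$ which defines $x_i$ as a function of the other variables. Then $\frac{\partial x_i}{\partial x_j}(x_0) = -\frac{F_{x_j}(x_0)}{F_{x_i}(x_0)}$ if $F_{x_i}(x_0) \neq 0$.

Proof. Using the chain rule, we differentiate both sides of $F(x) = k$ with respect to $x_j$, treating $x_i$ as a function of $x_j$ and the other variables, and treating each $x_l$ with $l \neq i, j$ as a constant. We get
$$\frac{\partial F}{\partial x_j}\frac{\partial x_j}{\partial x_j} + \frac{\partial F}{\partial x_i}\frac{\partial x_i}{\partial x_j} = 0$$
since all other $\frac{\partial x_l}{\partial x_j}$ terms are zero if $x_l$ is constant. Also, $\frac{\partial x_j}{\partial x_j} = 1$. Solving for $\frac{\partial x_i}{\partial x_j}$ (at $x_0$) gives us that $\frac{\partial x_i}{\partial x_j}(x_0) = -\frac{F_{x_j}(x_0)}{F_{x_i}(x_0)}$.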
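A quick numerical check of this implicit differentiation formula, as a sketch under assumptions: the level set is the sphere $x^2 + y^2 + z^2 = 14$ near the point $(1, 2, 3)$, where $z$ can be written explicitly as a function of $x$ and $y$. The function names and the chosen point are illustrative only.

```python
import math

def z_of(x, y):
    # Near (1, 2, 3) the sphere x^2 + y^2 + z^2 = 14 is the graph of this function.
    return math.sqrt(14.0 - x * x - y * y)

x0, y0, z0 = 1.0, 2.0, 3.0

# Partial derivatives of F(x, y, z) = x^2 + y^2 + z^2 at (x0, y0, z0).
F_x, F_y, F_z = 2.0 * x0, 2.0 * y0, 2.0 * z0

# Formula from the theorem: dz/dx = -F_x / F_z, valid since F_z != 0 here.
formula_dz_dx = -F_x / F_z

# Central-difference derivative of the explicit local graph, for comparison.
step = 1e-6
numeric_dz_dx = (z_of(x0 + step, y0) - z_of(x0 - step, y0)) / (2 * step)

print(formula_dz_dx, numeric_dz_dx)
```

Both values are approximately $-1/3$, so the formula agrees with differentiating the explicit local graph directly.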


The conditions under which one variable is locally a function of the other variables are discussed in the Implicit Function Theorem. First, however, we should establish a few more preliminary differentiation laws.


Theorem 11.13. Let $f, g : U \to \mathbb{R}^m$, where $U$ is open in $\mathbb{R}^n$, $f$ and $g$ are differentiable at $p \in U$, and $f(x) = (f_1(x), f_2(x), ..., f_m(x))$, and $g(x) = (g_1(x), g_2(x), ..., g_m(x))$. Then:
(a) Sum rule: $D(\alpha f + \beta g)(p) = \alpha Df(p) + \beta Dg(p)$.
(b) Dot product rule: $D(f \cdot g)(p) = g(p)Df(p) + f(p)Dg(p)$.

Proof. (a) We note that
$$\lim_{h \to 0} \frac{(\alpha f + \beta g)(p+h) - (\alpha f + \beta g)(p) - \alpha Df(p)h - \beta Dg(p)h}{|h|} = \lim_{h \to 0} \alpha\frac{f(p+h) - f(p) - Df(p)h}{|h|} + \lim_{h \to 0} \beta\frac{g(p+h) - g(p) - Dg(p)h}{|h|} = 0,$$
which implies the desired result.

(b) We take
$$\lim_{h \to 0} \frac{f(p+h) \cdot g(p+h) - f(p) \cdot g(p) - g(p)Df(p)h - f(p)Dg(p)h}{|h|}$$
$$= \lim_{h \to 0} \frac{g(p+h)}{|h|} \cdot (f(p+h) - f(p) - Df(p)h) + \lim_{h \to 0} \frac{f(p)}{|h|} \cdot (g(p+h) - g(p) - Dg(p)h) + \lim_{h \to 0} \frac{1}{|h|} Df(p)h \cdot (g(p+h) - g(p)),$$
assuming that each of these limits exists. The first two summands are limits equal to zero since $g$ and $f$ are continuous and differentiable at $p$. The third is also zero since $\left|\frac{1}{|h|} Df(p)h \cdot (g(p+h) - g(p))\right| \leq \frac{|Df(p)||h|}{|h|}|g(p+h) - g(p)|$ by the definition of operator norm and the Cauchy Schwarz Inequality, and since $g$ is continuous at $p$ it follows that $\lim_{h \to 0} |g(p+h) - g(p)| = 0$. Thus, $D(f \cdot g)(p) = g(p)Df(p) + f(p)Dg(p)$.


There are a variety of different restrictions we can place on curves that help us to refer 
to them. Some books do a good job of distinguishing between the parametrization for a 
curve and the curve (the image of that parametrization) itself. We are going to take the 
approach that in some contexts a curve could be referring to the parametrization function 
whose image is the curve, and in other contexts we might mean the image (or trace) of 
the curve (the curve itself and not the function generating the curve). This intentional 
ambiguity is because of notation commonly used for line integrals. We will use notations 


such as $\int_C F \cdot dr$ which we are not going to define yet. But the “C” part of that notation


refers to a curve. We want to talk about points on the curve (which means points on its 
trace), but the integral is not determined by the curve itself. Its orientation is needed, which 
means we are really saying the integral is dependent on the parametrization for the curve, 
meaning that we are using the same symbol, C, to refer to both the parametrization and to 
its image. This is a nuisance, and is similar to the problem of vector functions sometimes 
having outputs that refer to points, and vectors sometimes referring to matrices which are 
row or column vectors. 


Definition 81 


Let $r : D \to \mathbb{R}^n$ be a continuous function whose domain is an interval $D$. We refer to this function (or its image depending on context) as a parametrized curve or just a curve $C$. When we want to be clear that we are specifically referring to the image of a parametrized curve $r$ and not the function $r$ itself, we refer to $r(D)$ as the trace of $C$. Likewise, when we wish to be clear that we are referring to the function $r$ and not to its trace, we call $r$ a parametrization for $C$. A parametrized curve whose domain is a closed interval is also called a path.

If the parametrized curve $r : (a, b) \to \mathbb{R}^2$ has a trace which, if intersected with some $B_\epsilon(r(t_0))$, is also the graph of a function $y = f_1(x)$ or $x = f_2(y)$ for differentiable functions $f_1$ or $f_2$, then the trace of the curve is locally a differentiable function graph near $r(t_0)$, and a tangent line to the trace of $r((a, b))$ exists. In the case where $x'(t_0) = 0$ the tangent line is vertical (but still exists even though $\frac{dy}{dx}$ does not).


There are multiple parametrizations for the same curve. For instance, the circle in example (a) can be traversed by $r(t) = \langle a + R\cos(2t), b + R\sin(2t) \rangle$, $0 \leq t \leq \pi$, which is a parametrization tracing out the same curve counterclockwise at twice the speed of the former parametrization. We would like to formalize notions of speed and acceleration of a particle moving along a parametrized path, motivating the following definitions.




Let $r(t) = \langle x_1(t), x_2(t), ..., x_n(t) \rangle$ be a parametrized curve defined on an interval $I$. We should observe that for each point $t_0 \in I$, $\lim_{t \to t_0} r(t) = \langle z_1, z_2, ..., z_n \rangle$ if and only if $\lim_{t \to t_0} x_i(t) = z_i$ for each $1 \leq i \leq n$ by Theorem 10.17. Similarly, we know that $r'(t)$ exists if and only if $x_i'(t)$ exists for each $i$, in which case $r'(t) = \langle x_1'(t), x_2'(t), ..., x_n'(t) \rangle$.


Example 11.1. Find $\lim_{t \to 0} \left\langle \frac{\sin(3t)}{t}, \frac{e^t - 1 - t}{t}, \frac{\cos(t) - 1}{t^2} \right\rangle$.

Solution. We can just take the limit of each coordinate and use L'Hospital's Rule. For the first coordinate we get $\lim_{t \to 0} \frac{\sin(3t)}{t} = \lim_{t \to 0} \frac{3\cos(3t)}{1} = 3$. In the second coordinate, $\lim_{t \to 0} \frac{e^t - 1 - t}{t} = \lim_{t \to 0} \frac{e^t - 1}{1} = 0$, and in the third coordinate $\lim_{t \to 0} \frac{\cos(t) - 1}{t^2} = \lim_{t \to 0} \frac{-\sin(t)}{2t} = \lim_{t \to 0} \frac{-\cos(t)}{2} = -\frac{1}{2}$. Thus, the limit is $\langle 3, 0, -\frac{1}{2} \rangle$.
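The coordinatewise limit can be sanity-checked by evaluating the vector function at small values of $t$. This is only a numerical illustration of the limit $\langle 3, 0, -\frac{1}{2} \rangle$ found by L'Hospital's Rule; the sample values of $t$ are arbitrary choices.

```python
import math

def r(t):
    """The vector function from the example, defined for t != 0."""
    return (math.sin(3 * t) / t,
            (math.exp(t) - 1 - t) / t,
            (math.cos(t) - 1) / t ** 2)

# Evaluate at progressively smaller t to watch each coordinate settle.
samples = {t: r(t) for t in (1e-1, 1e-2, 1e-3)}
for t, value in samples.items():
    print(t, value)

approx = r(1e-3)
```

Each coordinate approaches the corresponding coordinate of $\langle 3, 0, -\frac{1}{2} \rangle$ as $t$ shrinks.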


Example 11.2. Find the derivative of $r(t) = \langle t, e^t, t^3 + 1 \rangle$ at $(0, 1, 1)$.


Solution. Setting the first coordinates equal to each other we see $t = 0$. We compute $r'(t) = \langle 1, e^t, 3t^2 \rangle$. Setting $t = 0$ we see $r'(0) = \langle 1, 1, 0 \rangle$.


Theorem 11.14. Let $F : U \to \mathbb{R}$ be a differentiable function, where $U$ is an open subset of $\mathbb{R}^n$, and let $x_0$ be a point on the graph $S$ of $F(x) = k$. Let $r(t)$ be a differentiable parametrized curve whose trace is in $S$ so that $r(t_0) = x_0$. Then:

(a) $r'(t_0)$ is orthogonal to $\nabla F(x_0)$.

(b) If $S$ is locally the graph of a differentiable function $x_n = f(x_1, x_2, ..., x_{n-1})$ then $\nabla F(x_0)$ is normal to $S$ at $x_0$.

(c) If $S$ is locally the graph of a differentiable function $x_n = f(x_1, x_2, ..., x_{n-1})$ and a vector $v$ is orthogonal to $r'(t_0)$ for every differentiable parametrized curve $r(t)$ whose trace is in $S$ so that $r(t_0) = x_0$, then $v = \lambda \nabla F(x_0)$ for some number $\lambda$.


Proof. (a) Since $F(r(t)) = k$, we can use the chain rule to differentiate both sides, leaving us with $\nabla F(r(t_0)) \cdot r'(t_0) = 0$. Thus, $r'(t_0)$ is perpendicular to $\nabla F(x_0)$.

(b) Next, assume $S$ is locally the graph of a differentiable function $x_n = f(x_1, x_2, ..., x_{n-1})$. Let $x_0 = (c_1, c_2, ..., c_n)$, and let $c = (c_1, c_2, ..., c_{n-1})$, so $f(c) = c_n$. Then choose $\epsilon > 0$ so that $B_\epsilon(x_0) \cap S$ is the graph of $f$, a function which is differentiable on some domain $D = \{(x_1, x_2, ..., x_{n-1}) \in \mathbb{R}^{n-1} \mid (x_1, x_2, ..., x_n) \in B_\epsilon(x_0) \cap S\}$. By Theorem 11.11, we know that $n = (\nabla f(c), -1)$ is normal to $B_\epsilon(x_0) \cap S$ at $x_0$. However, since there is no sequence of points in $S$ which converges to $x_0$ which is contained in the complement of $B_\epsilon(x_0)$, this also means that $n$ is normal to $S$ at $x_0$.

Since $F(x) - k = f(x_1, x_2, ..., x_{n-1}) - x_n = 0$ on $B_\epsilon(x_0) \cap S$ it follows that $\nabla F(x_0) = (\nabla f(c), -1) = n$.

(c) Assume $S$ is locally the graph of a differentiable function $x_n = f(x_1, x_2, ..., x_{n-1})$ on some rectangle $\prod_{i=1}^{n-1} (c_i - \epsilon, c_i + \epsilon)$ for some $\epsilon > 0$. For each $1 \leq i \leq n-1$ let
$$r_i(t) = (c_1, c_2, ..., c_{i-1}, c_i + t, c_{i+1}, ..., c_{n-1}, f(c_1, ..., c_{i-1}, c_i + t, c_{i+1}, ..., c_{n-1}))$$
for $t \in (-\epsilon, \epsilon)$. Then $r_i(t) \in S$, $r_i(0) = x_0$, and $r_i'(0) = (0, 0, ..., 0, 1, 0, 0, ..., 0, \nabla f(c) \cdot e_i) = (0, 0, ..., 0, 1, 0, ..., 0, f_{x_i}(c))$. Since each $r_i'(0)$ is perpendicular to $v = (v_1, v_2, ..., v_n)$, it must follow that each $v \cdot r_i'(0) = 0$ and therefore $v_i = -v_n f_{x_i}(c)$. Hence,
$$v = (-v_n f_{x_1}(c), -v_n f_{x_2}(c), ..., -v_n f_{x_{n-1}}(c), v_n) = -v_n(\nabla f(c), -1),$$
which means that $v = -v_n \nabla F(x_0)$.
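Part (a) can be illustrated numerically. In the sketch below the level set is the unit sphere $F(x, y, z) = x^2 + y^2 + z^2 = 1$ and the curve is a great circle on it; the function names, the tilt angle, and the parameter value `t0` are assumptions made for this example.

```python
import math

TILT = 0.5  # arbitrary angle selecting which great circle the curve traces

def r(t):
    # A curve whose trace lies on the unit sphere.
    return (math.cos(t), math.sin(t) * math.cos(TILT), math.sin(t) * math.sin(TILT))

def r_prime(t):
    # Coordinatewise derivative of r.
    return (-math.sin(t), math.cos(t) * math.cos(TILT), math.cos(t) * math.sin(TILT))

def grad_F(x, y, z):
    # Gradient of F(x, y, z) = x^2 + y^2 + z^2.
    return (2.0 * x, 2.0 * y, 2.0 * z)

t0 = 1.1
on_sphere = sum(coord * coord for coord in r(t0))          # should equal 1
dot = sum(a * b for a, b in zip(grad_F(*r(t0)), r_prime(t0)))  # should equal 0
print(on_sphere, dot)
```

The dot product vanishes up to rounding error, matching the claim that the velocity of any curve in the level set is orthogonal to the gradient.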


Theorem 11.15. The Mean Value Theorem for Real Valued Functions. Let f : U > R 
be a differentiable function, where U is an open set in R” which contains a line segment 
$L(a, b)$, where $a \neq b$. Then there is a point $c \in L(a, b)$ which is not equal to $a$ or $b$ so that
$\nabla f(c) \cdot (b - a) = f(b) - f(a)$.


Proof. Define g(t) = f(a+t(b—a)), where t € [0,1]. Then g is continuous on [0,1] and 
differentiable on (0,1) by the chain rule, so by the Mean Value Theorem (for one variable), 
there is a point d € (0,1) so that g’(d)(1 — 0) = g(1) — g(0) = f(b) — f(a). By the chain 
rule, $g'(d) = \nabla f(a + d(b - a)) \cdot (b - a) = \nabla f(c) \cdot (b - a)$, where $c = a + d(b - a)$.


Theorem 11.16. Let $U$ be a connected open subset of $\mathbb{R}^n$ and let $f : U \to \mathbb{R}$ be a differentiable function so that $\nabla f(a) = 0$ for each $a \in U$. Then for some number $k$ it is true that $f(a) = k$ for all $a \in U$.


Proof. Let $a, b \in U$. Then by Theorem 10.47, there is a polygonal path $P(a, b) = L(a, x_1) \cup L(x_1, x_2) \cup ... \cup L(x_{m-1}, b) \subset U$. For any line segment $L(p, q) \subset U$ we can find some $c \in L(p, q)$ so that $f(q) - f(p) = \nabla f(c) \cdot (q - p) = 0$ by the Mean Value Theorem for Real Valued Functions, which means that $f(q) = f(p)$. Thus, it follows that $f(a) = f(x_1) = ... = f(b)$. Hence, $f(x) = f(a) = k$ for all $x \in U$.


Theorem 11.17. The Mean Value Theorem for Vector Valued Functions. Let $f : U \to \mathbb{R}^m$ be a differentiable function, where $U$ is an open set in $\mathbb{R}^n$ which contains a line segment $L(a, b)$, where $a \neq b$, and let $v \in \mathbb{R}^m$. Then there is a point $c \in L(a, b)$ which is not equal to $a$ or $b$ so that $v \cdot Df(c)(b - a) = v \cdot (f(b) - f(a))$.


Proof. Define $g(t) = v \cdot (f(a + t(b - a)))$, where $t \in [0, 1]$. Then $g$ is continuous on $[0, 1]$ and differentiable on $(0, 1)$ by the chain rule, so by the Mean Value Theorem (for one variable), there is a point $d \in (0, 1)$ so that $g'(d)(1 - 0) = g(1) - g(0) = v \cdot (f(b) - f(a))$. By the chain rule, $g'(d) = v \cdot Df(c)(b - a)$, where $c = a + d(b - a)$, which completes the proof.


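Theorem 11.18. Let $f : U \to \mathbb{R}^m$ be $C^1$, where $U$ is an open set in $\mathbb{R}^n$, and let $K \subset U$ be compact. Let $L(x, y) \subset U$ and let $|Df(z)| \leq M$ for all $z \in L(x, y)$. Then $|f(x) - f(y)| \leq M|x - y|$.

If $K$ is a compact subset of $U$ and $L(x, y) \subset K$, and $M = \max_{x \in K} \sqrt{m} \sum_{i=1}^{m} \sum_{j=1}^{n} \left|\frac{\partial f_i}{\partial x_j}(x)\right|$, then $|f(x) - f(y)| \leq M|x - y|$.


Proof. First, let $L(x, y) \subset U$. If $f(x) = f(y)$ then the result is immediate. Assume $f(x) \neq f(y)$. By the Mean Value Theorem for Vector Valued Functions, using $v = f(y) - f(x)$ from the theorem statement, we can find a point $c \in L(x, y)$ so that
$$(f(y) - f(x)) \cdot Df(c)(y - x) = |f(y) - f(x)|^2 \leq |f(y) - f(x)||Df(c)||y - x|.$$
Dividing by $|f(y) - f(x)| > 0$ gives $|f(y) - f(x)| \leq |Df(c)||y - x| \leq M|x - y|$.

Next, assume $L(x, y) \subset K$, a compact subset of $U$. Since $f$ is $C^1$ we know that $\sqrt{m} \sum_{i=1}^{m} \sum_{j=1}^{n} \left|\frac{\partial f_i}{\partial x_j}\right|$ is continuous and has a maximum value $M$ on $K$ by the Extreme Value Theorem. By Theorem 10.32,
$$|Df(c)| \leq \sqrt{m} \sum_{i=1}^{m} \sum_{j=1}^{n} \left|\frac{\partial f_i}{\partial x_j}(c)\right| \leq \max_{x \in K} \sqrt{m} \sum_{i=1}^{m} \sum_{j=1}^{n} \left|\frac{\partial f_i}{\partial x_j}(x)\right| = M$$
on $K$, so $|f(x) - f(y)| \leq M|x - y|$.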


It is also helpful to have a higher variable form of Taylor’s Theorem. Unfortunately, 
multivariable Taylor series are more cumbersome than single variable series, but they are 
still quite useful. 


Definition 82 


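Let $f : U \to \mathbb{R}$ be a $C^k$ differentiable function, where $U$ is an open subset of $\mathbb{R}^n$ which contains the line segment $L(x, x + h)$. We define the $k$th order differential of $f$ at $x$ with displacement $h$ to be
$$D^{(k)} f(x, h) = \sum_{i_1=1}^{n} \sum_{i_2=1}^{n} \cdots \sum_{i_k=1}^{n} \frac{\partial^k f}{\partial x_{i_1} \partial x_{i_2} \cdots \partial x_{i_k}}(x)\, h_{i_1} h_{i_2} \cdots h_{i_k}.$$

Notice that $D^{(1)} f(x, h) = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(x)h_i = \nabla f(x) \cdot h$, which is the standard differential approximation $df$ for the change in $f$. Also, notice that if $f$ is $C^k$ then
$$D^{(1)}\big(D^{(k-1)} f(x, h)\big)(x, h) = \sum_{i_k=1}^{n} \frac{\partial}{\partial x_{i_k}}\left(\sum_{i_1=1}^{n} \sum_{i_2=1}^{n} \cdots \sum_{i_{k-1}=1}^{n} \frac{\partial^{k-1} f}{\partial x_{i_1} \partial x_{i_2} \cdots \partial x_{i_{k-1}}}(x)\, h_{i_1} h_{i_2} \cdots h_{i_{k-1}}\right) h_{i_k} = D^{(k)} f(x, h),$$
where $h$ is held fixed while differentiating with respect to $x$.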


This allows us to generalize Taylor’s Theorem to n variables as follows: 


Theorem 11.19. Taylor's Theorem for $\mathbb{R}^n$. Let $f : U \to \mathbb{R}$ be a $C^{k+1}$ function, where $U$ is open in $\mathbb{R}^n$ and $L(x, x + h) \subset U$. Then
$$f(x + h) = f(x) + \sum_{j=1}^{k} \frac{1}{j!} D^{(j)} f(x, h) + \frac{1}{(k+1)!} D^{(k+1)} f(c, h)$$
for some point $c \in L(x, x + h)$.


Proof. Let $g(t) = f(x + th)$ on $t \in [0, 1]$. Then $g'(t) = \nabla f(x + th) \cdot h$ and $g'(0) = D^{(1)} f(x, h)$. Likewise, $g''(0) = D^{(2)} f(x, h)$ and so on, and, more generally, $g^{(j)}(t) = D^{(j)} f(x + th, h)$. Since $g$ has a continuous $(k+1)$st derivative on $[0, 1]$ we can use Taylor's Theorem in one variable to get
$$g(1) = g(0) + \sum_{j=1}^{k} \frac{g^{(j)}(0)}{j!}(1 - 0)^j + \frac{g^{(k+1)}(d)}{(k+1)!}(1 - 0)^{k+1}$$
for some point $d \in (0, 1)$.

Thus, $f(x + h) = f(x) + \sum_{j=1}^{k} \frac{1}{j!} D^{(j)} f(x, h) + \frac{1}{(k+1)!} D^{(k+1)} f(c, h)$ for some point $c = x + dh \in L(x, x + h)$.
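To make the $k$th order differential concrete, here is a small sketch that evaluates $D^{(j)} f(x, h)$ as the full sum over index tuples and compares Taylor polynomials of increasing order with the exact value. The function $f(x, y) = e^x \cos(y)$, the base point, and the displacement are assumptions chosen because all partial derivatives have a simple closed form; the helper names are hypothetical.

```python
import math
from itertools import product

def partial(indices, point):
    """Partial derivative of f(x, y) = exp(x) * cos(y) along an index tuple.

    Index 0 means d/dx and index 1 means d/dy. Each d/dx leaves exp(x)
    unchanged and each d/dy shifts the cosine by pi/2.
    """
    x, y = point
    y_count = sum(1 for i in indices if i == 1)
    return math.exp(x) * math.cos(y + y_count * math.pi / 2)

def differential(order, point, h):
    """Evaluate D^(order) f(point, h) as the sum over all index tuples."""
    total = 0.0
    for indices in product(range(len(point)), repeat=order):
        term = partial(indices, point)
        for i in indices:
            term *= h[i]
        total += term
    return total

def taylor(order, point, h):
    """Taylor polynomial f(x) + sum_{j=1}^{order} D^(j) f(x, h) / j!."""
    return partial((), point) + sum(
        differential(j, point, h) / math.factorial(j)
        for j in range(1, order + 1))

x, h = (0.0, 0.0), (0.1, 0.2)
exact = partial((), (x[0] + h[0], x[1] + h[1]))  # f(x + h)
errors = [abs(exact - taylor(k, x, h)) for k in (1, 2, 3)]
print(errors)
```

For this choice of $f$ and $h$ the error decreases as the order increases, which is consistent with the remainder term $\frac{1}{(k+1)!} D^{(k+1)} f(c, h)$ shrinking for small $|h|$.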


Definition 83 


Let $f : U \to \mathbb{R}$ be a function, where $U$ is an open subset of $\mathbb{R}^n$ and $p \in U$. We say that $(p, f(p))$ is a local maximum for $f$ if there is some $\epsilon > 0$ so that $f(x) \leq f(p)$ for all $x \in B_\epsilon(p)$. We say that $(p, f(p))$ is a local minimum for $f$ if there is some $\epsilon > 0$ so that $f(x) \geq f(p)$ for all $x \in B_\epsilon(p)$. We say that $(p, f(p))$ is a local extremum for $f$ if $(p, f(p))$ is a local maximum or a local minimum for $f$. We say $(p, f(p))$ is a saddle point for $f$ if $\nabla f(p) = 0$, but $(p, f(p))$ is not a local extremum of $f$. We will refer to a point $p$ where $\nabla f(p) = 0$ as a critical point for $f$.


Theorem 11.20. Let $f : U \to \mathbb{R}$ be a function which is differentiable at $p \in U$, where $U$ is open in $\mathbb{R}^n$ and $(p, f(p))$ is a local extremum. Then $\nabla f(p) = 0$.


Proof. Let $i$ be an integer so that $1 \leq i \leq n$. Define $g(t) = f(p + te_i)$. Then $g$ takes on a local extremum at $t = 0$ since $g(0) = f(p)$. Since $g$ is differentiable at zero (by the chain rule), it follows that $g'(0) = f_{x_i}(p) = 0$. Since this is true for all $1 \leq i \leq n$ it follows that $\nabla f(p) = 0$.


This is helpful in finding extrema but it is not sufficient by itself, because, just as with 
single variable functions, a multivariable function may have saddle points. For example, 
$f(x, y) = x^2 - y^2$ has both partial derivatives equal to zero at $(0, 0)$ but $(0, 0, 0)$ is not a
local extremum for this function. 


We might say a point is a saddle point for f or of f or for the graph of f or of the graph 
of f, and all are acceptable, and it is quite normal to just refer to such a point as a saddle 
point if the function or surface it is a saddle point of is understood. We look to additional 
tests to identify whether a point is a local extremum once we have found all critical points 
of a function. We can generalize a form of the first derivative test and a form of the second 
derivative test to dimensions higher than one. 


The following theorem generalizes the first derivative test. 




Theorem 11.21. Let $f : U \to \mathbb{R}$ be a differentiable function, where $U$ is open in $\mathbb{R}^n$ and $p$ is a critical point of $f$. Then the following are true:

(a) The point $(p, f(p))$ is a local maximum for $f$ if there is an $\epsilon > 0$ so that for all $0 < t < \epsilon$, for every unit vector $u \in \mathbb{R}^n$, the directional derivative $D_u f(p + tu) \leq 0$.

(b) The point $(p, f(p))$ is a local minimum for $f$ if there is an $\epsilon > 0$ so that for all $0 < t < \epsilon$, for every unit vector $u \in \mathbb{R}^n$, the directional derivative $D_u f(p + tu) \geq 0$.


Proof. We prove (a) first. Let $x \in B_\epsilon(p)$ with $x \neq p$. Then we can choose a unit vector $u$ and $t_0 \in (0, \epsilon)$ so that $p + t_0 u = x$ (specifically, $u = \frac{x - p}{|x - p|}$ and $t_0 = |x - p|$). Define $g(t) = f(p + tu)$. Then for all $0 < t < \epsilon$, we know that $g'(t) = \nabla f(p + tu) \cdot u = D_u f(p + tu) \leq 0$. Thus, $g$ is non-increasing on $[0, \epsilon)$, which means that $g(t_0) \leq g(0)$. Therefore, $f(x) \leq f(p)$, which means that $(p, f(p))$ is a local maximum for $f$.

The proof of (b) is similar, or just observe that by negating a function $f$ satisfying the hypotheses of (b) we have a function $-f$ satisfying the hypotheses of (a), and if $(p, -f(p))$ is a local maximum for $-f$ then $(p, f(p))$ is a local minimum for $f$.


There is also a generalization of the second derivative test. A more convenient generalization 
of this test works in R?. Both are outlined in the theorem below. 


Theorem 11.22. Second Derivative Test for multivariable functions.

(a) Let $f : U \to \mathbb{R}$, where $f$ is $C^2$ on the open set $U$ in $\mathbb{R}^n$. If $\nabla f(p) = 0$ for some $p \in U$ and $D_u D_u f(p) > 0$ for all $u \in \mathbb{R}^n$ so that $|u| = 1$, then $(p, f(p))$ is a local minimum for $f$. If $D_u D_u f(p) < 0$ for all $u \in \mathbb{R}^n$ so that $|u| = 1$, then $(p, f(p))$ is a local maximum for $f$. If there is some unit vector $u$ so that $D_u D_u f(p) < 0$ and another unit vector $v$ so that $D_v D_v f(p) > 0$ then $(p, f(p))$ is a saddle point for $f$.

(b) Let $f : U \to \mathbb{R}$, where $f$ is $C^2$ on the open set $U$ in $\mathbb{R}^2$. Let $p$ be a critical point of $f$. Let $D = f_{xx} f_{yy} - (f_{xy})^2$. If $D(p) > 0$ then $(p, f(p))$ is a local maximum for $f$ if $f_{xx}(p) < 0$ and $(p, f(p))$ is a local minimum for $f$ if $f_{xx}(p) > 0$. If $D(p) < 0$ then $(p, f(p))$ is a saddle point for $f$.


Proof. (a) Choose a $\delta > 0$ so that $\overline{B_\delta(p)} \subset U$. We define $g_u(t) = f(p + tu)$. Using the chain rule, we have that $g_u''(t) = D_u D_u f(p + tu) = D^{(2)} f(p + tu, u)$ (as discussed with the proof of Taylor's Theorem for $\mathbb{R}^n$), which is continuous since all second partial derivatives of $f$ are continuous.

Setting $G(x, u) = D_u D_u f(x)$ (a function whose domain is in $\mathbb{R}^{2n}$ where $(x, u)$ represents the vector whose first $n$ coordinates are those of $x$ and whose second $n$ coordinates are those of $u$), we note that this is a continuous function over all $u \in \mathbb{R}^n$ and $x \in \overline{B_\delta(p)}$. If we let $D = \overline{B_2(0)}$ in $\mathbb{R}^n$, the closed radius two ball about the origin, then $H = \{(x, u) \mid x \in \overline{B_\delta(p)}$ and $u \in D\}$ is a closed and bounded set in $\mathbb{R}^{2n}$. Thus, $G$ is uniformly continuous on $H$ by Theorem 10.35.

Since $C = \{(p, u) \mid |u| = 1\}$ is closed and bounded, $|G(x, u)|$ takes on a minimum value $m$ on $C$ by the Extreme Value Theorem. Assuming that $G(x, u) \neq 0$ on $C$, then $m > 0$. Hence, if $G$ is positive on $C$ then $G(p, u) \geq m$ on $C$, and if $G$ is negative on all of $C$ then $G(p, u) \leq -m$ on $C$.

Since $G$ is uniformly continuous, we can find a $0 < \gamma < \delta$ so that if $|(x, u) - (y, v)| < \gamma$ then $|G(x, u) - G(y, v)| < \frac{m}{2}$. Thus, for all $x \in B_\gamma(p)$, if $|u| = 1$ then $|G(x, u)| > \frac{m}{2}$.

By the second derivative test in one variable, $g_u(t)$ takes on a strict local maximum at $t = 0$ if $g_u''(0) = D_u D_u f(p) < 0$ and a strict local minimum if $D_u D_u f(p) > 0$. Furthermore, if $G$ is negative on $C$ then for all $x = p + tu \in B_\gamma(p)$ it follows that $g_u''(t) = D_u D_u f(x) < -\frac{m}{2}$, which means that $g_u'(t) < 0$ for each unit vector $u$ and all $0 < t < \gamma$ (since the first derivative is decreasing on $(0, \gamma)$ and is zero at $t = 0$). Thus, by Theorem 11.21, the point $(p, f(p))$ is a local maximum for $f$. Likewise, if $G$ is positive on $C$ then by a similar argument $(p, f(p))$ is a local minimum for $f$.

If there is some unit vector $u$ so that $D_u D_u f(p) < 0$ and another unit vector $v$ so that $D_v D_v f(p) > 0$ then this means that for sufficiently small values of $t > 0$ it is true that $g_u(t) < g_u(0)$ and $g_v(t) > g_v(0)$. Hence, there are points $p + tu$ and $p + tv$ which are arbitrarily close to $p$ so that $f(p + tu) < f(p)$ and $f(p + tv) > f(p)$, so $(p, f(p))$ is a saddle point for $f$.

(b) Let $u = (\Delta x, \Delta y) \in \mathbb{R}^2$ be a unit vector. We, again, define $g_u(t) = f(p + tu)$. Using the chain rule, we have that $g_u'(t) = \nabla f(p + tu) \cdot u = f_x(p + tu)\Delta x + f_y(p + tu)\Delta y$. Thus,
$$g_u''(t) = (f_{xx}(p + tu)\Delta x + f_{yx}(p + tu)\Delta y)\Delta x + (f_{xy}(p + tu)\Delta x + f_{yy}(p + tu)\Delta y)\Delta y = f_{xx}(p + tu)(\Delta x)^2 + 2f_{xy}(p + tu)\Delta x \Delta y + f_{yy}(p + tu)(\Delta y)^2$$
by Clairaut's Theorem. Thus, $g_u''(0) = D_u D_u f(p) = f_{xx}(p)(\Delta x)^2 + 2f_{xy}(p)\Delta x \Delta y + f_{yy}(p)(\Delta y)^2$.

We recall that for a quadratic equation $at^2 + bt + c$, there are no real zeroes for this equation if $b^2 - 4ac < 0$, and there are two real zeroes for this equation if $b^2 - 4ac > 0$. This fact comes from the quadratic formula, which can be proven by completing the square (but we will assume this is known from an earlier algebra course). For now, we will assume that $\Delta y \neq 0$ (the proof is similar if $\Delta x \neq 0$).

For $t = \frac{\Delta x}{\Delta y}$ we can rewrite $g_u''(0) = D_u D_u f(p) = f_{xx}(p)(\Delta x)^2 + 2f_{xy}(p)\Delta x \Delta y + f_{yy}(p)(\Delta y)^2 = (\Delta y)^2[f_{xx}(p)t^2 + 2f_{xy}(p)t + f_{yy}(p)]$. Since $(\Delta y)^2 > 0$, the sign of $g_u''(0)$ is the same as the sign of $f_{xx}(p)t^2 + 2f_{xy}(p)t + f_{yy}(p)$. However, this is a quadratic expression in $t$ and it has two zeroes if $4(f_{xy}(p))^2 - 4f_{xx}(p)f_{yy}(p) > 0$ and no zeroes if $4(f_{xy}(p))^2 - 4f_{xx}(p)f_{yy}(p) < 0$. Hence, if $D(p) = f_{xx}(p)f_{yy}(p) - (f_{xy}(p))^2 > 0$ and $f_{xx}(p) > 0$ then that means $f_{xx}(p)t^2 + 2f_{xy}(p)t + f_{yy}(p) > 0$ for all $t$ (since this equation has no zeroes and yields a positive value when $t = 0$, because $f_{xx}(p)$ and $f_{yy}(p)$ must have the same sign and be non-zero if $f_{xx}(p)f_{yy}(p) - (f_{xy}(p))^2 > 0$). Likewise, if $f_{xx}(p) < 0$ then $f_{xx}(p)t^2 + 2f_{xy}(p)t + f_{yy}(p) < 0$ for all $t$. Hence, if $D(p) > 0$ and $f_{xx}(p) > 0$ then $D_u D_u f(p) > 0$ for all unit vectors $u$, so by part (a) we know $(p, f(p))$ is a local minimum for $f$. Similarly, if $D(p) > 0$ and $f_{xx}(p) < 0$ then $D_u D_u f(p) < 0$ for all unit vectors $u$, so by part (a) we know $(p, f(p))$ is a local maximum for $f$.

In the case where $D(p) < 0$, the quadratic equation given has two zeroes and is therefore negative for some value $t_0$ of $t$ and positive for some value $t_1$ of $t$. If we wish to be more specific, by picking angles $\theta_0, \theta_1$ so that $\cot(\theta_0) = t_0$ and $\cot(\theta_1) = t_1$, we can set $\Delta x_0 = \cos(\theta_0)$, $\Delta y_0 = \sin(\theta_0)$, $\Delta x_1 = \cos(\theta_1)$, and $\Delta y_1 = \sin(\theta_1)$. Then $u_0 = (\Delta x_0, \Delta y_0)$ and $u_1 = (\Delta x_1, \Delta y_1)$ are unit vectors so that $D_{u_1} D_{u_1} f(p) > 0$ and $D_{u_0} D_{u_0} f(p) < 0$, so by part (a) again, we know that $(p, f(p))$ is a saddle point for $f$.


Example 11.3. Find all local extrema and saddle points for $f(x, y) = x^4 - 4xy + y^4 + 1$.


Solution. Setting partials to zero we have $f_x = 4x^3 - 4y = 0$ so $y = x^3$, and $f_y = -4x + 4y^3 = 0$ so $x = y^3$. Thus, $(0, 0)$, $(1, 1)$, and $(-1, -1)$ are critical points. We then evaluate $f_{xx} = 12x^2$, $f_{xy} = -4$ and $f_{yy} = 12y^2$. Thus, if $D = f_{xx} f_{yy} - (f_{xy})^2$ then $D(0, 0) = -16 < 0$ so there is a saddle point at $(0, 0, 1)$. Since $D(1, 1) = 128 > 0$ and $f_{xx}(1, 1) = 12 > 0$, $f$ has a local minimum $(1, 1, -1)$. Since $D(-1, -1) = 128 > 0$ and $f_{xx}(-1, -1) = 12 > 0$, $f$ also has a local minimum $(-1, -1, -1)$. This function has no local maxima.
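The classification in this example can be reproduced mechanically with the two-variable second derivative test. The sketch below hard-codes the second partials of $f(x, y) = x^4 - 4xy + y^4 + 1$; the function names `second_partials` and `classify` are hypothetical helpers introduced for illustration.

```python
def f(x, y):
    return x ** 4 - 4 * x * y + y ** 4 + 1

def second_partials(x, y):
    """Return (f_xx, f_xy, f_yy) for f(x, y) = x^4 - 4xy + y^4 + 1."""
    return 12 * x * x, -4.0, 12 * y * y

def classify(x, y):
    """Apply the test D = f_xx * f_yy - (f_xy)^2 at a critical point (x, y)."""
    f_xx, f_xy, f_yy = second_partials(x, y)
    d = f_xx * f_yy - f_xy ** 2
    if d < 0:
        return "saddle point"
    if d > 0:
        return "local minimum" if f_xx > 0 else "local maximum"
    return "inconclusive"  # the test gives no information when D = 0

critical_points = [(0, 0), (1, 1), (-1, -1)]
results = {pt: (classify(*pt), f(*pt)) for pt in critical_points}
for pt, (kind, value) in results.items():
    print(pt, kind, value)
```

The output matches the solution: a saddle point at $(0, 0, 1)$ and local minima at $(1, 1, -1)$ and $(-1, -1, -1)$.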


The Inverse Function Theorem and the Implicit Function Theorem are important for 
many theorems. 


Theorem 11.23. The Inverse Function Theorem. We use variable notation $x_1 = x$, $x_2 = y$, $x_3 = z$. Let $f = (f_1, f_2, ..., f_n) : U \to \mathbb{R}^n$ be a $C^1$ function so that $\det Df(p) \neq 0$, where $p \in U$, an open set in $\mathbb{R}^n$. Then there is an $r > 0$ so that:

(a) The function $\det\left[\frac{\partial f_i}{\partial x_j}(c_i)\right]_{n \times n}$ is a continuous function of the entries $c_i$ for $1 \leq i \leq n$, and for some $m > 0$ it is true that $\left|\det\left[\frac{\partial f_i}{\partial x_j}(c_i)\right]_{n \times n}\right| > m$ for all $c_i \in \overline{B_r(p)}$.

(b) $f$ is one to one on $\overline{B_r(p)}$.

(c) $f(B_r(p))$ is open and $f$ restricted to $B_r(p)$ has inverse $f^{-1}$ which is continuous on $f(B_r(p))$.

(d) If we restrict $f$ to $B_r(p)$ then $f^{-1} = (f_1^{-1}, f_2^{-1}, ..., f_n^{-1}) : f(B_r(p)) \to \mathbb{R}^n$ is $C^1$.

(e) If we restrict $f$ to $B_r(p)$ then $f^{-1}$ is $C^1$ on $f(B_r(p))$, and $Df^{-1}(f(x)) = (Df(x))^{-1}$ for all $x \in B_r(p)$.


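Proof. (a) The determinant is a sum of constants times products of continuous functions of the entries over the $n$-fold Cartesian product of $U$ in $\mathbb{R}^{n^2}$, and is thus continuous.

Since $\det\left[\frac{\partial f_i}{\partial x_j}(c_i)\right]_{n \times n}$ is continuous at $(p, p, ..., p)$ (the vector with entries in $p$ listed $n$ times), where $\det\left[\frac{\partial f_i}{\partial x_j}(p)\right]_{n \times n} = \det Df(p) \neq 0$, we can find a $\delta > 0$ so that $B_\delta(p, p, ..., p) \subset U \times U \times ... \times U$ and
$$\left|\det Df(p) - \det\left[\frac{\partial f_i}{\partial x_j}(c_i)\right]_{n \times n}\right| < \frac{|\det Df(p)|}{2}$$
for all $(c_1, c_2, ..., c_n) \in B_\delta(p, p, ..., p)$. Thus, $\left|\det\left[\frac{\partial f_i}{\partial x_j}(c_i)\right]_{n \times n}\right| > \frac{|\det Df(p)|}{2}$ for all $(c_1, c_2, ..., c_n) \in B_\delta(p, p, ..., p)$ in $\mathbb{R}^{n^2}$. Choosing $r \in \left(0, \frac{\delta}{\sqrt{n}}\right)$ we notice that if $c_1, c_2, ..., c_n \in \overline{B_r(p)}$ in $\mathbb{R}^n$ then
$$|(c_1, c_2, ..., c_n) - (p, p, ..., p)| = \sqrt{\sum_{i=1}^{n} |c_i - p|^2} \leq r\sqrt{n} < \delta,$$
so $\left|\det\left[\frac{\partial f_i}{\partial x_j}(c_i)\right]_{n \times n}\right| > \frac{|\det Df(p)|}{2} = m$.

(b) Let $y, z \in \overline{B_r(p)}$, where $y = (y_1, y_2, ..., y_n)$ and $z = (z_1, z_2, ..., z_n)$. Since each $f_i$ is differentiable, by the Mean Value Theorem for Real Valued Functions, for each $1 \leq i \leq n$ we can find $c_i \in L(y, z) \subset \overline{B_r(p)}$ so that $\nabla f_i(c_i) \cdot (z - y) = f_i(z) - f_i(y)$. Suppose $f_i(z) - f_i(y) = 0$ for each $1 \leq i \leq n$. Choosing a $c_i$ satisfying $\nabla f_i(c_i) \cdot (z - y) = 0$ for each $1 \leq i \leq n$ gives a system of $n$ equations in the $n$ variables $(z_j - y_j)$ whose coefficient matrix has determinant $\det\left[\frac{\partial f_i}{\partial x_j}(c_i)\right]_{n \times n}$.

By part (a) the determinant of the coefficient matrix is non-zero, so by Cramer's Rule the system has a unique solution. The only solution is therefore $z_j = y_j$ for each $j$, meaning that if $f(z) = f(y)$ then $z = y$, so $f$ is one to one on $\overline{B_r(p)}$.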

(c) Let $w \in B_r(p)$. Since $f$ is one to one on $\overline{B_r(p)}$, the function $g : \overline{B_r(p)} \to \mathbb{R}$ defined by $g(x) = |f(w) - f(x)|$ is positive on $\partial(B_r(p))$. Since $\partial(B_r(p))$ is closed and bounded, there is a least value $l$ of $g$ on $\partial(B_r(p))$. In other words, for all $x \in \partial(B_r(p))$, the distance between $f(x)$ and $f(w)$ is at least $l$.

Let $q \in B_{l/2}(f(w))$. We then define a function $h_q : \overline{B_r(p)} \to \mathbb{R}$ by $h_q(x) = |q - f(x)|$. Since $\overline{B_r(p)}$ is compact and $h_q$ is continuous, it must follow from the Extreme Value Theorem that $h_q$ takes on a minimum value at some point $z \in \overline{B_r(p)}$. Since $h_q(w) = |q - f(w)| < \frac{l}{2}$ and for every point $b$ on the boundary of $B_r(p)$ we know that $|f(b) - f(w)| \geq l$, it is impossible for $z$ to be a point on the boundary of $B_r(p)$ since, by the Triangle Inequality, $h_q(b) = |f(b) - q| \geq |f(b) - f(w)| - |f(w) - q| > \frac{l}{2}$. Thus, we know that $z \in B_r(p)$.

Since $h_q$ has a minimum at $z$, it also follows that $(h_q)^2$ has a minimum at $z$. By Theorem 11.20, it must follow that $\nabla(h_q)^2(z) = 0$ (since $z$ is in the interior of the domain of $h_q$). Let $q = (q_1, q_2, ..., q_n)$. Since $(h_q)^2(x) = \sum_{i=1}^{n} (f_i(x) - q_i)^2$, by the chain rule we have
$$\nabla(h_q)^2(z) = \left(\sum_{i=1}^{n} 2(f_i(z) - q_i)\frac{\partial f_i}{\partial x_1}(z), \sum_{i=1}^{n} 2(f_i(z) - q_i)\frac{\partial f_i}{\partial x_2}(z), ..., \sum_{i=1}^{n} 2(f_i(z) - q_i)\frac{\partial f_i}{\partial x_n}(z)\right) = (0, 0, 0, ..., 0).$$
This gives the system of equations $\sum_{i=1}^{n} (f_i(z) - q_i)\frac{\partial f_i}{\partial x_j}(z) = 0$ for $1 \leq j \leq n$. Hence, treating the $(f_i(z) - q_i)$ terms as the variables, the coefficient matrix has determinant $\det Df(z) \neq 0$, so there is a unique solution by Cramer's Rule, which must be $f_i(z) - q_i = 0$ for all $1 \leq i \leq n$, which means that $f(z) = q$. Thus, $B_{l/2}(f(w)) \subset f(B_r(p))$. Hence, every point of $f(B_r(p))$ is contained in an open ball which is contained in $f(B_r(p))$, which means that $f(B_r(p))$ is open.

Next, we wish to show that $f$ restricted to $B_r(p)$ has inverse $f^{-1}$ which is continuous on $f(B_r(p))$. Let $W$ be an open set in $\mathbb{R}^n$. Then $(f^{-1})^{-1}(W) = f(W \cap B_r(p))$ since $f$ is one to one. For each point $x \in W \cap B_r(p)$ we can find an $r_x > 0$ by the argument above so that $B_{r_x}(x) \subset (W \cap B_r(p))$ and $f(B_{r_x}(x))$ is open. Thus, $f(W \cap B_r(p)) = \bigcup_{x \in W \cap B_r(p)} f(B_{r_x}(x))$, which is open. Thus, $f^{-1}$ is continuous.

(d) Let $f(x_0) = y_0 \in f(B_r(p))$ and choose $R > 0$ so that $B_R(y_0) \subset f(B_r(p))$. Let $0 < |t| < R$. For each $1 \leq k \leq n$ there is a unique $x_{(k,t)} \in B_r(p)$ so that $f(x_{(k,t)}) = y_0 + te_k$. By the Mean Value Theorem for Real Valued Functions we can find, for each $1 \leq i \leq n$, some $c_i \in L(x_0, x_{(k,t)})$ so that
$$\nabla f_i(c_i) \cdot \frac{x_{(k,t)} - x_0}{t} = \frac{f_i(x_{(k,t)}) - f_i(x_0)}{t},$$
which is equal to zero if $i \neq k$ and equal to one if $i = k$. This creates a system of equations with coefficient matrix determinant $\det\left[\frac{\partial f_i}{\partial x_j}(c_i)\right]_{n \times n}$ which is non-zero. By Cramer's Rule, for each $1 \leq m \leq n$, we can then solve for the $m$th component of $\frac{x_{(k,t)} - x_0}{t}$, which is $\frac{f_m^{-1}(y_0 + te_k) - f_m^{-1}(y_0)}{t}$, and which is equal to a ratio of determinants whose denominators are non-zero and whose matrix entries vary continuously with $t$. More specifically,
$$\frac{f_m^{-1}(y_0 + te_k) - f_m^{-1}(y_0)}{t} = \frac{\det A_m}{\det\left[\frac{\partial f_i}{\partial x_j}(c_i)\right]_{n \times n}},$$
where $A_m$ is the matrix obtained by replacing the $m$th column of $\left[\frac{\partial f_i}{\partial x_j}(c_i)\right]_{n \times n}$ by $e_k$, the right-hand side of the system.

As $t \to 0$, the points $c_i \to x_0$ by part (c) since $f^{-1}$ is continuous. Thus, we can take the limit
$$\lim_{t \to 0} \frac{f_m^{-1}(y_0 + te_k) - f_m^{-1}(y_0)}{t} = \frac{\partial f_m^{-1}}{\partial y_k}(y_0) = \frac{\det B_m(x_0)}{\det Df(x_0)},$$
where $B_m(x_0)$ is obtained by replacing the $m$th column of $Df(x_0)$ by $e_k$.

Let $B_m(x)$ be the matrix obtained by replacing the $m$th column of $Df(x)$ by $e_k$. The ratio $\frac{\det B_m(x)}{\det Df(x)} = \frac{\partial f_m^{-1}}{\partial y_k}(y)$, where $y = f(x)$, varies continuously with $y$ since $f^{-1}$ is continuous on $f(B_r(p))$, and both $\det B_m(x)$ and $\det Df(x)$ are continuous functions of $x$, where $\det Df(x) \neq 0$ on $B_r(p)$. Thus, each of the partial derivatives of $f^{-1}$ is continuous.

(e) The derivative of the identity function $g(x) = x$ on $\mathbb{R}^n$ is the identity matrix $I_n$, consisting of one entries on the diagonal and zero entries everywhere else. This can be seen by simply taking the partial derivatives directly. Since we know $f^{-1}$ is $C^1$ on $f(B_r(p))$, for each $x_0 \in B_r(p)$, by the chain rule we know that $D(f \circ f^{-1})(f(x_0)) = I_n = Df(f^{-1}(f(x_0)))Df^{-1}(f(x_0))$. Hence, it follows that $Df^{-1}(f(x_0))$ and $Df(x_0)$ are inverses of each other (by Theorem 14.12).

The following is a minor corollary to the Inverse Function Theorem that can be helpful 
for deciding when topological properties are preserved. This is a sufficiently direct application 
of the Inverse Function Theorem that when we quote this result we may just say “by The 
Inverse Function Theorem.” 


Theorem 11.24. Let $\phi : U \to \mathbb{R}^n$ be $C^1$ with $\Delta_\phi \neq 0$ on $U$, an open set in $\mathbb{R}^n$. Let $V$
be an open subset of $U$. Then $\phi(V)$ is open. Furthermore, if $\phi$ is one to one then $\phi$ is a
homeomorphism from $U$ to $\phi(U)$.


Proof. Let $V$ be an open subset of $U$. Let $\phi(p) \in \phi(V)$ for some $p \in V$. By the Inverse
Function Theorem there is some $\varepsilon > 0$ so that $B_\varepsilon(p) \subset V$ and $\phi(B_\varepsilon(p))$ is an open set containing
$\phi(p)$ which is contained in $\phi(V)$. Hence, $\phi(V)$ is open.

If $\phi$ is one to one, since we know $\phi$ is $C^1$ and therefore continuous, and $(\phi^{-1})^{-1}(V) =
\phi(V)$ is open for every open $V \subset U$, we know that $\phi^{-1}$ is continuous. Every function is
onto its own image, so the function $\phi : U \to \phi(U)$ is a homeomorphism.


Presenting the Implicit Function Theorem’s most general form may not immediately 
make sense to some readers, so we plan to proceed along two approaches. First, we
present a proof that parallels methods of Courant’s Introduction to Calculus and Analysis
(volume 2), which proves the theorem for two variables, in which context it is easy to 
discuss. Then we will present generalizations based on the argument included in Wade’s 
Introduction to Analysis to higher dimensions. The hope is that by the time we prove the 
most general version the previous discussion will help us to explain what it means. 


Theorem 11.25. The Implicit Function Theorem for two variable real valued functions.
Let $F : U \to \mathbb{R}$ be a $C^1$ function, where $U$ is open in $\mathbb{R}^2$ and $(x_0, y_0) = p \in U$ so that
$F(x_0, y_0) = 0$ and $F_y(x_0, y_0) \neq 0$. Then there is a rectangle $[x_0 - a, x_0 + a] \times [y_0 - b, y_0 + b] \subset U$
(where $a, b > 0$) so that for each $x \in [x_0 - a, x_0 + a]$ there is exactly one $y = f(x) \in
[y_0 - b, y_0 + b]$ so that $F(x, f(x)) = 0$. Furthermore, $f(x_0) = y_0$ and the function $f$ is $C^1$,
and $f'(x) = -\frac{F_x}{F_y}(x, f(x))$.


Proof. We will assume $F_y(x_0, y_0) = 2m > 0$ (the argument if $m < 0$ is similar). Since $F_y$ is
continuous we can find $a_1, b > 0$ which are small enough so that $F_y > m$ on $[x_0 - a_1, x_0 +
a_1] \times [y_0 - b, y_0 + b] \subset U$. Since $F_y > m$ it follows that $F(x, y)$ is an increasing function in
the variable $y$ for each $x \in [x_0 - a_1, x_0 + a_1]$, so there is at most one $y$ so that $F(x, y) = 0$
for any given $x \in [x_0 - a_1, x_0 + a_1]$. By the Extreme Value Theorem, since $F_x$ is continuous
on the rectangle $[x_0 - a_1, x_0 + a_1] \times [y_0 - b, y_0 + b]$, we can find a number $M$ which exceeds $|F_x|$
on $[x_0 - a_1, x_0 + a_1] \times [y_0 - b, y_0 + b]$.

For each $x \in [x_0 - a_1, x_0 + a_1]$, by the Mean Value Theorem we can find $c$ between $x_0$ and $x$ so that
$F_x(c, y_0)(x - x_0) = F(x, y_0) - F(x_0, y_0) = F(x, y_0)$. Thus, $|F(x, y_0)| < Ma_1$. Furthermore,
also by the Mean Value Theorem, we can find $c_x \in (y_0, y_0 + b)$ for each $x \in [x_0 - a_1, x_0 + a_1]$ so that
$F(x, y_0 + b) = F(x, y_0 + b) - F(x, y_0) + F(x, y_0) = F_y(x, c_x)(b) + F(x, y_0) > mb - Ma_1$.
Likewise, we can find $d_x \in (y_0 - b, y_0)$ so that $F(x, y_0 - b) = F(x, y_0 - b) - F(x, y_0) +
F(x, y_0) = -F_y(x, d_x)(b) + F(x, y_0) < -mb + Ma_1$. Replacing $a_1$ by a smaller positive
number $a$ so that $mb - M(a + \varepsilon) > 0$ for some $0 < \varepsilon < a$, we have that $F(x, y_0 + b) > 0$ and
$F(x, y_0 - b) < 0$ for all $x \in [x_0 - a - \varepsilon, x_0 + a + \varepsilon]$. Hence, by the Intermediate Value Theorem, for
each $x \in [x_0 - a - \varepsilon, x_0 + a + \varepsilon]$ there is $y = f(x) \in [y_0 - b, y_0 + b]$ so that $F(x, f(x)) = 0$.
Since we know that $F(x_0, y_0) = 0$ it follows that $f(x_0) = y_0$.

Let $x \in [x_0 - a, x_0 + a]$. Let $0 < |h| < \varepsilon$ and set $k = f(x + h) - f(x)$. By the Mean Value
Theorem for Real Valued Functions we can find a point $c \in L((x, f(x)), (x + h, f(x) + k))$ so
that $\nabla F(c) \cdot (h, k) = F(x + h, f(x + h)) - F(x, f(x)) = 0$. Hence, $F_x(c)h + F_y(c)k = 0$, so
$\frac{k}{h} = \frac{f(x + h) - f(x)}{h} = -\frac{F_x(c)}{F_y(c)}$. Since $|F_x| < M$ and $F_y > m$ on the rectangle, $|k| \le \frac{M}{m}|h|$,
so $k \to 0$ and therefore $c \to (x, f(x))$ as $h \to 0$. Hence, $\lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = -\frac{F_x}{F_y}(x, f(x))$. Since this
limit exists, $f$ is differentiable and therefore continuous. Since $f$ is continuous, $-\frac{F_x}{F_y}(x, f(x))$
is continuous, and therefore $y = f(x)$ is $C^1$ on $[x_0 - a, x_0 + a]$.


To help see what this theorem means, we look at a circle $x^2 + y^2 = 1$. The graph
of this relation is not a function. We can see that $y$ is not a function of $x$ nor is $x$ a
function of $y$ since the vertical line test is failed (there is more than one $y$ value for each
$x$ value apart from those at the left and right ends of the circle). We can write this as
$F(x, y) = x^2 + y^2 - 1 = 0$. Then $F_y = 2y \neq 0$ unless $y = 0$. That means that if $y \neq 0$ then
we can find a rectangle containing $(x, y)$ on which, locally (ignoring the graph of $F = 0$ outside
the small rectangle), $y = f(x)$ is a function of $x$. At $y = 0$ this is not true, and on the
graph of this circle we can see that no matter how small a rectangle we take about $(1, 0)$ or
$(-1, 0)$ the resulting piece of the graph of the circle would still fail the vertical line test (so
$y$ would not be a function of $x$). Furthermore, since $F$ has partial derivatives that vary continuously
with $x$ and $y$, the derivative $f'(x)$ would be continuous over this small rectangle.

Note that this theorem does not tell us what the function $f$ is, only that it exists, which
is frequently important for us to know even when we cannot determine $f$.
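
As a rough numerical illustration of the circle discussion (a sketch only, not part of the proof), the following Python fragment recovers $y = f(x)$ near a point of the circle by bisection in $y$, which mirrors the Intermediate Value Theorem step of the proof, and compares a difference quotient with $-F_x/F_y$. The base point $(0.6, 0.8)$, the half-widths `a` and `b`, and the helper name `solve_y` are all assumptions chosen for illustration.

```python
def F(x, y):
    """Left side of the circle equation F(x, y) = x^2 + y^2 - 1 = 0."""
    return x * x + y * y - 1.0


def solve_y(x, y_lo, y_hi, tol=1e-12):
    """Bisection for the unique y in [y_lo, y_hi] with F(x, y) = 0.

    Relies on F(x, y_lo) < 0 < F(x, y_hi) and on F increasing in y,
    which is what F_y > 0 on the rectangle provides.
    """
    assert F(x, y_lo) < 0.0 < F(x, y_hi)
    while y_hi - y_lo > tol:
        mid = 0.5 * (y_lo + y_hi)
        if F(x, mid) < 0.0:
            y_lo = mid
        else:
            y_hi = mid
    return 0.5 * (y_lo + y_hi)


# Hypothetical base point and rectangle half-widths (illustrative choices).
x0, y0 = 0.6, 0.8
a, b = 0.1, 0.3


def f(x):
    """The implicitly defined function on [x0 - a, x0 + a]."""
    return solve_y(x, y0 - b, y0 + b)


h = 1e-5
slope_numeric = (f(x0 + h) - f(x0 - h)) / (2 * h)  # central difference
slope_formula = -(2 * x0) / (2 * f(x0))            # -F_x / F_y at (x0, f(x0))
print(f(x0), slope_numeric, slope_formula)
```

Choosing a base point with $y_0 = 0$ would make the sign check in `solve_y` fail for every small rectangle, which matches the failure of the hypothesis $F_y \neq 0$ at $(\pm 1, 0)$.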




Definition 84 


Let $F : \mathbb{R}^m \to \mathbb{R}^n$ where $m = n + k$, where $k \in \mathbb{N}$. If $x_{i_1}, \dots, x_{i_r}$ are variables of
$F$ and $r \le n$ then we denote

$$\frac{\partial(F_1, F_2, \dots, F_r)}{\partial(x_{i_1}, \dots, x_{i_r})} = \det\left[\frac{\partial F_j}{\partial x_{i_l}}\right]_{r \times r}.$$

We use the notation
$(x, t)$ to denote the vector in $\mathbb{R}^{n+k}$ whose first $n$ coordinates are the entries of vector
$x \in \mathbb{R}^n$ and whose last $k$ entries are those of vector $t \in \mathbb{R}^k$. It is understood that if
we write the dimension of a Euclidean space as a sum of positive integers in this way
and an element of the space as a pair of vectors then the first vector has a number of
coordinates equal to the first integer listed in the sum and the second has a number
of coordinates equal to the second integer in the sum.


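Theorem 11.26. The Implicit Function Theorem. Let $F = (F_1, F_2, \dots, F_n) : U \to \mathbb{R}^n$ be $C^1$,
where $U$ is an open subset of $\mathbb{R}^{n+k}$ which contains the point $(x_0, t_0)$, and let $F(x_0, t_0) = 0$,
where $\frac{\partial(F_1, F_2, \dots, F_n)}{\partial(x_1, x_2, \dots, x_n)}(x_0, t_0) \neq 0$. Then there is an open set $B_r(t_0)$ in $\mathbb{R}^k$ so that for each
$t \in B_r(t_0)$ there is a unique $x = g(t) \in \mathbb{R}^n$ so that $F(g(t), t) = 0$. Furthermore this function
$g$ is $C^1$ on $B_r(t_0)$ and $g(t_0) = x_0$.

Proof. We begin by defining $G(x, t) = (F(x, t), t)$ on $U$. Then the domain and range of $G$
are in $\mathbb{R}^{n+k}$ and

$$\det DG(x_0, t_0) = \det
\begin{pmatrix}
\frac{\partial F_1}{\partial x_1} & \frac{\partial F_1}{\partial x_2} & \cdots & \frac{\partial F_1}{\partial x_n} & \frac{\partial F_1}{\partial t_1} & \frac{\partial F_1}{\partial t_2} & \cdots & \frac{\partial F_1}{\partial t_k} \\
\frac{\partial F_2}{\partial x_1} & \frac{\partial F_2}{\partial x_2} & \cdots & \frac{\partial F_2}{\partial x_n} & \frac{\partial F_2}{\partial t_1} & \frac{\partial F_2}{\partial t_2} & \cdots & \frac{\partial F_2}{\partial t_k} \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
\frac{\partial F_n}{\partial x_1} & \frac{\partial F_n}{\partial x_2} & \cdots & \frac{\partial F_n}{\partial x_n} & \frac{\partial F_n}{\partial t_1} & \frac{\partial F_n}{\partial t_2} & \cdots & \frac{\partial F_n}{\partial t_k} \\
0 & 0 & \cdots & 0 & 1 & 0 & \cdots & 0 \\
0 & 0 & \cdots & 0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 0 & 0 & 0 & \cdots & 1
\end{pmatrix}(x_0, t_0)
= \frac{\partial(F_1, F_2, \dots, F_n)}{\partial(x_1, x_2, \dots, x_n)}(x_0, t_0) \neq 0.$$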

Thus, by the Inverse Function Theorem, there is some $R > 0$ so that (i) the function
$\det\left[\frac{\partial G_i}{\partial x_j}(c_i)\right]_{(n+k) \times (n+k)}$ is a continuous and non-zero function of the $c_i$ entries for $c_i \in B_R(x_0, t_0)$,

(ii) $G$ is one to one on $B_R(x_0, t_0)$ and

(iii) $G(B_R(x_0, t_0))$ is open and $G^{-1}(y, t) = (G_1^{-1}(y, t), G_2^{-1}(y, t), \dots, G_n^{-1}(y, t), t)$ is $C^1$
on $G(B_R(x_0, t_0))$. Note that the fact that the last $k$ entries are those of $t$ is not a consequence
of the Inverse Function Theorem, but a consequence of the definition of $G$ (the only point
that can map to a point whose last $k$ coordinates are those of $t$ under $G$ is a point whose
last $k$ coordinates are those of $t$).

Since $G(B_R(x_0, t_0))$ is open, we can find some $r > 0$ so that $B_r(G(x_0, t_0)) \subset G(B_R(x_0, t_0))$,
where $G(x_0, t_0) = (0, t_0)$ by construction, and of course $B_r(t_0)$ is open in $\mathbb{R}^k$, and $t \in B_r(t_0)$
if and only if $(0, t) \in B_r(0, t_0)$.

We define $g(t) = (G_1^{-1}(0, t), G_2^{-1}(0, t), \dots, G_n^{-1}(0, t))$ for each $t \in B_r(t_0)$. Note that
$g(t_0) = x_0$ since $G(x_0, t_0) = (0, t_0)$, so $g(t_0)$ is the vector whose coordinates are the first $n$
coordinates of the point mapping to $(0, t_0)$, which is $x_0$.




Observe that $F(g(t), t)$ is the vector whose coordinates are the first $n$ coordinates of the
image of a point which is the inverse image of a point whose first $n$ coordinates are zero,
which means that $F(g(t), t) = 0$ for all $t \in B_r(t_0)$.

Next, note that each partial $\frac{\partial g_i}{\partial t_j}(t) = \frac{\partial G_i^{-1}}{\partial t_j}(0, t)$ because $G_i^{-1}(0, t) = g_i(t)$ by definition
of $g$, so

$$\lim_{h \to 0} \frac{g_i(t + he_j) - g_i(t)}{h} = \lim_{h \to 0} \frac{G_i^{-1}((0, t) + he_{n+j}) - G_i^{-1}(0, t)}{h} = \frac{\partial G_i^{-1}}{\partial t_j}(0, t),$$

which varies continuously with $t$ since $G^{-1}$ is $C^1$, and therefore $g$ is $C^1$ on $B_r(t_0)$. Also, observe
that $(g(t), t) = G^{-1}(0, t) \in B_R(x_0, t_0)$ for all $t \in B_r(t_0)$.

Suppose $F(h(t), t) = 0$ for some $t \in B_r(t_0)$. Then $G(h(t), t) = (0, t) = G(g(t), t)$.
Since $G$ is one to one on $B_R(x_0, t_0)$, it follows that $(h(t), t) = (g(t), t)$ which means that
$h(t) = g(t)$.

Thus, we have established all points of the theorem and the proof is complete.
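
To make the statement concrete, here is a small numerical sketch with $n = 2$ and $k = 1$. The system $F(x_1, x_2, t) = (x_1^2 + x_2^2 - 2 - t,\; x_1 - x_2)$ is a hypothetical example, not one from the text; it satisfies $F(1, 1, 0) = 0$ with Jacobian determinant $-4$ in the $x$ variables at that point. The implicit function $g(t)$ is approximated by Newton's method (a numerical technique substituted here for the existence argument of the proof), using Cramer's Rule for each linear solve.

```python
import math


def F(x1, x2, t):
    """Hypothetical system F : R^(2+1) -> R^2 with F(1, 1, 0) = (0, 0)."""
    return (x1 * x1 + x2 * x2 - 2.0 - t, x1 - x2)


def g_of_t(t, start=(1.0, 1.0), steps=50):
    """Approximate the implicit function g(t) near x0 = (1, 1) by Newton's method."""
    x1, x2 = start
    for _ in range(steps):
        f1, f2 = F(x1, x2, t)
        # Jacobian of F with respect to (x1, x2): [[2*x1, 2*x2], [1, -1]].
        a, b, c, d = 2.0 * x1, 2.0 * x2, 1.0, -1.0
        det = a * d - b * c  # nonzero near (1, 1), as the theorem requires
        # Solve J * (dx1, dx2) = (f1, f2) by Cramer's Rule.
        dx1 = (f1 * d - b * f2) / det
        dx2 = (a * f2 - f1 * c) / det
        x1, x2 = x1 - dx1, x2 - dx2
    return x1, x2


for t in (0.0, 0.1, -0.1):
    exact = math.sqrt((2.0 + t) / 2.0)  # closed form for this particular F
    print(t, g_of_t(t), exact)
```

For this particular system the implicit function has the closed form $g(t) = (\sqrt{(2+t)/2}, \sqrt{(2+t)/2})$, so the printed pairs can be compared directly.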


In some situations it seems to be less intuitive to have the $g$ function have the first
coordinates as outputs instead of the last coordinates. So, we also note that we could have
done it the other way. Here is a minor restatement of the theorem.


Theorem 11.27. Restatement of Implicit Function Theorem. Let $G = (G_1, G_2, \dots, G_k) :
U \to \mathbb{R}^k$ be $C^1$, where $U$ is an open subset of $\mathbb{R}^{n+k}$ which contains the point $(x_0, t_0)$, and
let $G(x_0, t_0) = 0$, where $\frac{\partial(G_1, G_2, \dots, G_k)}{\partial(t_1, t_2, \dots, t_k)}(x_0, t_0) \neq 0$. Then there is an open set $B_r(x_0)$ in
$\mathbb{R}^n$ so that for each $x \in B_r(x_0)$ there is a unique $t = g(x) \in \mathbb{R}^k$ so that $G(x, g(x)) = 0$.
Furthermore this function $g$ is $C^1$ on $B_r(x_0)$ and $g(x_0) = t_0$.


Proof. We apply the earlier statement of the Implicit Function Theorem, switching $n$ and
$k$ and letting $G(x, t) = F(t, x)$ in the earlier statement.


An immediate consequence of the Implicit Function Theorem is the following: 


Theorem 11.28. Let $F : U \to \mathbb{R}$, where $U$ is open in $\mathbb{R}^n$ containing $p$, $F$ is $C^1$ and $S$
is the graph of $F(x) = k$. If $\nabla F(p) \neq 0$ on $S$ then $S$ is locally the graph of a $C^1$ function
at $p$. In particular, if $F_{x_i}(p) \neq 0$ for some $i$ then $S$ is locally the graph of a $C^1$ function
$x_i = g(x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n)$ at $p$.


Proof. We will assume that $F_{x_n}(p) \neq 0$ (the other partials being non-zero leads to an
argument that is similar, only changing variable names). Let $p = (c_1, c_2, \dots, c_n)$ and let
$c = (c_1, c_2, \dots, c_{n-1})$. By the Implicit Function Theorem there is an open ball $B_r(c)$ so that
for every $x \in B_r(c)$ there is exactly one $x_n = g(x)$ so that $F(x, g(x)) = k$. The function
$g$ is $C^1$ on $B_r(c)$. Thus, $(B_r(c) \times \mathbb{R}) \cap S$ is the graph of $g$ on $B_r(c)$, and thus for any
$0 < \varepsilon < r$ we know that $B_\varepsilon(p) \cap S$ is the graph of $g$ on the open set $D = \{(x_1, x_2, \dots, x_{n-1}) \in
\mathbb{R}^{n-1} \mid (x_1, x_2, \dots, x_{n-1}, g(x_1, x_2, \dots, x_{n-1})) \in B_\varepsilon(p) \cap S\}$.




Frequently, we wish to find extrema of a multivariable function $f : \mathbb{R}^n \to \mathbb{R}$ subject to
a condition restricting the values in the domain to be considered, which may be designated
as a constraint that $g(x) = k$, for some $g : \mathbb{R}^n \to \mathbb{R}$. Assuming that $f$ and $g$ are both
differentiable functions, it is helpful to use the method of Lagrange Multipliers to achieve
this. It turns out that at any point $p$ where there is a local extremum of $f$ subject to
constraint $g(x) = k$, it will be true that $\nabla f(p) = \lambda \nabla g(p)$ (for some number $\lambda$) assuming
the constraint graph is differentiable, the reasons for which will be explained below. By
solving this equation and the constraint equation as a system we are able to identify points
at which an extremum could occur. In most problems in which Lagrange Multipliers are
used we are looking for an absolute extremum rather than a local extremum (though this
method does identify local extrema subject to the constraint as well). This usually requires
an additional step of verifying that there actually is an absolute maximum and an absolute
minimum at one of the relative extrema. If so, then by comparing the values of the function
$f$ at the points satisfying the aforementioned $\nabla f(p) = \lambda \nabla g(p)$ system of equations, we can
identify the largest of these function values as the absolute maximum value and the smallest
as the absolute minimum value.


Before we prove the theorem about Lagrange Multipliers we mention that it is fairly
easy to see the reason for the Lagrange Multiplier equation graphically for a two variable
function. For a two variable function $z = f(x, y)$ if we use a contour map wherein we sketch
the graphs of trace curves $f(x, y) = h$ (called contour lines) for many values of $h$, and on
the same two dimensional graph we sketch the graph of $g(x, y) = k$, a constraint curve,
then if the graph of $g(x, y) = k$ passes through a contour line $f(x, y) = h_1$ then the graph
$g(x, y) = k$ would also intersect nearby contour lines for heights $h_2$ smaller than $h_1$ and $h_3$ larger
than $h_1$ (on either side of the $f(x, y) = h_1$ contour line). Thus, the maximum of $f$ subject
to $g(x, y) = k$ could not occur at a point on the curve where the slope of $g(x, y) = k$ is
different from the slope of $f(x, y) = h_1$ (since if the slopes are different the curves will cross
through one another). However, the slopes of these curves are the same if and only if the
gradients are parallel, which means that $\nabla f(p) = \lambda \nabla g(p)$ at a point $p$ where an extremum
occurs.




Contour Map for z = f(x,y) with Constraint Curve g(x,y) =k 


Before we move further, we should be somewhat more precise about what we mean by 
a local maximum of a function subject to a constraint. 


Definition 85 


Let $f : U \to \mathbb{R}$ and $g : \mathbb{R}^n \to \mathbb{R}$ be differentiable functions, where $U$ is open in
$\mathbb{R}^n$. We say that a point $(p, f(p))$ (for some $p \in U$) is a local maximum for $f$ subject
to the constraint $g(x) = k$ if $g(p) = k$ and there is an $\varepsilon > 0$ so that if $x \in B_\varepsilon(p)$ and
$g(x) = k$ then $f(x) \le f(p)$. Similarly, $(p, f(p))$ is a local minimum for $f$ subject to
the constraint $g(x) = k$ if $g(p) = k$ and there is an $\varepsilon > 0$ so that if $x \in B_\varepsilon(p)$ and
$g(x) = k$ then $f(x) \ge f(p)$.


The following theorem outlines a proof for Lagrange’s method of undetermined multipliers
(the number $\lambda$ being referred to as Lagrange’s multiplier) when only a single constraint is
used.


Theorem 11.29. Lagrange Multipliers with one constraint. Let $f : U \to \mathbb{R}$ and $g : \mathbb{R}^n \to \mathbb{R}$
be $C^1$ functions, where $U$ is open in $\mathbb{R}^n$ and $p \in U$ and $(p, f(p))$ is a local extremum for $f$
subject to the constraint $g(x) = k$, where $\nabla g \neq 0$. Then for some number $\lambda$, it is true
that $\nabla f(p) = \lambda \nabla g(p)$.


Proof. First, since $\nabla g \neq 0$, we know that the graph of $g(x) = k$ is locally the graph of a
differentiable function by Theorem 11.28. Thus, by Theorem 11.14 we know that $\nabla g(p)$ is
normal to the graph of $g(x) = k$ at $p$.




Let $r(t)$ be a differentiable parametrized curve so that $g(r(t)) = k$ and $r(t_0) = p$. Since
$f(r(t))$ is a one variable function that takes on a local extremum at $t = t_0$, it follows that
$(f(r(t)))'(t_0) = 0$, so by the chain rule, $\nabla f(r(t_0)) \cdot r'(t_0) = 0$. Thus, by Theorem 11.14, there
is some $\lambda$ so that $\nabla f(p) = \lambda \nabla g(p)$.


Example 11.4. Find the maximum and minimum values of $f(x, y, z) = xy + z^2$ subject to
the constraint $x^2 + 2y^2 + 3z^2 = 12$.


Solution. Both $f$ and $g$ are locally $C^\infty$ functions at every point, so we don’t need to
worry about conditions for Lagrange multipliers not being satisfied anywhere. The ellipsoid
constraint graph is closed and bounded so we know for certain that there will be a maximum
and minimum because of the Extreme Value Theorem, and so we know that these extrema
will occur where $\nabla f = \lambda \nabla g$ and $g(x, y, z) = x^2 + 2y^2 + 3z^2 = 12$.

Thus, $f_x = \lambda g_x$, $f_y = \lambda g_y$ and $f_z = \lambda g_z$, and the constraint is satisfied where the extrema
occur. This means the extrema must satisfy the following system of equations:

$y = \lambda(2x)$
$x = \lambda(4y)$
$2z = \lambda(6z)$
$x^2 + 2y^2 + 3z^2 = 12$

To solve this system it may be a good idea to try to isolate $\lambda$ in the first three equations.
If $x = 0$ then we cannot divide by $x$ since that would be division by zero, so we consider
multiple cases. If $x = 0$ then from the second equation we see that either $\lambda = 0$ or $y = 0$.
It is not possible that $\lambda = 0$ because then $x = y = z = 0$ and that does not satisfy the
constraint equation. Hence, if $x = 0$ then $y = 0$, so in the constraint we would have that
$3z^2 = 12$, so $z = \pm 2$. Otherwise, $x \neq 0$ which means that $\lambda = \frac{y}{2x}$ and since $\lambda \neq 0$ we know
$y \neq 0$ and thus $x \neq 0$ and $\lambda = \frac{x}{4y}$. From the third equation we know that either $z \neq 0$, in
which case $\lambda = \frac{1}{3}$, or $z = 0$. Since $\lambda = \frac{y}{2x} = \frac{x}{4y}$ we have that $4y^2 = 2x^2$, so $x^2 = 2y^2$. If
$z = 0$ then $4y^2 = 12$, so $y = \pm\sqrt{3}$ and $x = \pm\sqrt{6}$. If $z \neq 0$ then since $\lambda = \frac{1}{3}$ we know that
$3y = 2x$ and $3x = 4y$, which is only possible if $x = y = 0$, which is a case we have already
covered. Thus, the possible solutions to the system are $(0, 0, \pm 2)$, $(\pm\sqrt{6}, \pm\sqrt{3}, 0)$.

We test those solutions in the function $f$ and compare the values to see which are largest
and smallest to identify the extrema. This gives us:

$f(0, 0, \pm 2) = 4$.

$f(\sqrt{6}, \sqrt{3}, 0) = \sqrt{18}$, meaning $(\sqrt{6}, \sqrt{3}, 0, \sqrt{18})$ is an absolute maximum for $f$ subject
to the constraint.

$f(-\sqrt{6}, -\sqrt{3}, 0) = \sqrt{18}$, meaning $(-\sqrt{6}, -\sqrt{3}, 0, \sqrt{18})$ is an absolute maximum for $f$
subject to the constraint.

$f(-\sqrt{6}, \sqrt{3}, 0) = -\sqrt{18}$, meaning $(-\sqrt{6}, \sqrt{3}, 0, -\sqrt{18})$ is an absolute minimum for $f$
subject to the constraint.

$f(\sqrt{6}, -\sqrt{3}, 0) = -\sqrt{18}$, meaning $(\sqrt{6}, -\sqrt{3}, 0, -\sqrt{18})$ is an absolute minimum for $f$
subject to the constraint.

The wording of the question did not ask for all the absolute extrema, but rather the
maximum and minimum value. The maximum value of $f$ subject to the constraint is $\sqrt{18}$
and the minimum value of $f$ subject to the constraint is $-\sqrt{18}$.
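
The candidate points can be sanity-checked numerically. The sketch below (an illustration only, with all function and variable names chosen here) evaluates $f$ at the six candidates and then samples the ellipsoid through the parametrization $x = \sqrt{12}\sin\varphi\cos\theta$, $y = \sqrt{6}\sin\varphi\sin\theta$, $z = 2\cos\varphi$, which is an assumption introduced for the check rather than part of the solution.

```python
import math


def f(x, y, z):
    return x * y + z * z


def constraint(x, y, z):
    return x * x + 2.0 * y * y + 3.0 * z * z


r6, r3 = math.sqrt(6.0), math.sqrt(3.0)
candidates = [
    (0.0, 0.0, 2.0), (0.0, 0.0, -2.0),
    (r6, r3, 0.0), (-r6, -r3, 0.0),
    (-r6, r3, 0.0), (r6, -r3, 0.0),
]
for p in candidates:
    # Each candidate should satisfy the constraint (value 12).
    print(p, constraint(*p), f(*p))

# Sample the ellipsoid on a grid in (theta, phi) and record f.
n_theta, n_phi = 200, 100
sampled = []
for i in range(n_theta):
    theta = 2.0 * math.pi * i / n_theta
    for j in range(n_phi + 1):
        phi = math.pi * j / n_phi
        x = math.sqrt(12.0) * math.sin(phi) * math.cos(theta)
        y = math.sqrt(6.0) * math.sin(phi) * math.sin(theta)
        z = 2.0 * math.cos(phi)
        sampled.append(f(x, y, z))

print(max(sampled), min(sampled), math.sqrt(18.0))
```

A grid search of this kind does not prove anything, but it is a quick way to catch an arithmetic slip when comparing candidate values.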




In some cases it is not optimal to use Lagrange multipliers, and solving for one variable
and plugging it into the equation for another is sufficient and possibly easier, but we
sometimes make errors when we do so. For example, in the preceding example, we could have
solved for $z^2 = \frac{12 - 2y^2 - x^2}{3}$ and plugged into $f$ to get $f(x, y, z) = xy + \frac{12 - 2y^2 - x^2}{3} =
h(x, y)$ and tried to find the extrema of the resulting two variable function. We could then
have solved to find local extrema using the techniques of the preceding section. Solving for
critical points we would have had:

$h_x = y - \frac{2x}{3} = 0$.

$h_y = x - \frac{4y}{3} = 0$.

This system’s only solution is at $(0, 0)$ and we could have solved for $z = \pm 2$, but
the corresponding points would not have been the absolute extrema. In fact, $h(x, y) =
xy + \frac{12 - 2y^2 - x^2}{3}$ has no minimum value because it becomes arbitrarily negative along
$y = 0$ as $x$ becomes large, whereas $x$ cannot become any larger than $\sqrt{12}$ along the original
constraint. When substituting a solution of a constraint into an equation it is important
to make sure that no information is lost. It is true that if $x^2 + 2y^2 + 3z^2 = 12$ then
$f(x, y, z) = xy + \frac{12 - 2y^2 - x^2}{3}$, but the constraint $x^2 + 2y^2 + 3z^2 = 12$ tells us more than
that. Simply determining where local extrema of $h(x, y) = xy + \frac{12 - 2y^2 - x^2}{3}$ might occur
did not tell us where the local extrema of $f$ subject to the constraint were because knowing
that $f(x, y, z) = xy + \frac{12 - 2y^2 - x^2}{3}$ does not tell us that $f(x, y, z) = xy + z^2$, where
$z^2 = \frac{12 - 2y^2 - x^2}{3}$. In other words, the substitution step was not reversible. We further
observe that at $(\sqrt{6}, \sqrt{3}, 0)$ the variable $z$ is not locally a function of $x$ and $y$ but rather $y$
is locally a function of $x$ and $z$ and $x$ is locally a function of $y$ and $z$, so the methods of the
preceding section are not applicable for determining a local extremum if $z$ is used as the
dependent variable. This is much like the case in single variable calculus where we have to
be wary of the boundary points of an interval and test the end points separately. In this
case, the ellipse on the boundary of the domain over which $z$ is a differentiable function of $x$
and $y$ in the constraint surface is a place where the extrema might not show up at a critical
point. Thus, we warn the reader that while it is often quicker to substitute information
from a constraint into an equation, unless we are careful when we do so we may lose
information that will cause us to fail to notice one of the absolute extrema. Be careful that
the extrema are known to be points that would appear where $z$ is a differentiable function
of $x$ and $y$ in the constraint or that the points where this is not true are checked separately
if you do such substitutions.
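
The loss of information in the substitution can be seen numerically. In the sketch below (names chosen for illustration), $h$ agrees with $f$ at a point of the constraint surface, yet $h$ evaluated off the ellipse $x^2 + 2y^2 \le 12$ decreases without bound, and the value at its only critical point is below the constrained maximum $\sqrt{18}$.

```python
import math


def f(x, y, z):
    return x * y + z * z


def h(x, y):
    """f after substituting z^2 = (12 - 2y^2 - x^2)/3, ignoring the need for z^2 >= 0."""
    return x * y + (12.0 - 2.0 * y * y - x * x) / 3.0


# On the constraint surface the two expressions agree, e.g. at (1, 2, 1).
x, y = 1.0, 2.0
z = math.sqrt((12.0 - 2.0 * y * y - x * x) / 3.0)
print(f(x, y, z), h(x, y))

# The only critical point of h is (0, 0), where h = 4 < sqrt(18).
print(h(0.0, 0.0), math.sqrt(18.0))

# Along y = 0, h keeps decreasing once x^2 > 12, where no real z exists.
for big_x in (10.0, 100.0, 1000.0):
    print(big_x, h(big_x, 0.0))
```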


We sometimes want to add a second constraint for a three variable function and so on.
We can add as many constraints as we wish if the number of constraints is smaller than
the number of variables, and the pertinent partial Jacobian is non-zero where the extrema
occur. For simplicity we will set $g$ and $h$ to zero instead of an arbitrary constant $k$ (this
does not reduce the generality of the theorem since by subtracting $k$ from both sides of a
$g(x, y, z) = k$ constraint we get a $G(x, y, z) = g(x, y, z) - k = 0$ constraint), and we will just
do the proof for the two constraints for a three variable function case. The argument can
be extended to any number of constraints which is smaller than the number of variables,
but the general argument looks a bit messy and it is easy to get lost in the subscripts, so
we are not going to include it here.


Theorem 11.30. Lagrange Multipliers with two constraints. Let $f : U \to \mathbb{R}$, $g : \mathbb{R}^3 \to \mathbb{R}$
and $h : \mathbb{R}^3 \to \mathbb{R}$ be $C^1$ functions, where $U$ is open in $\mathbb{R}^3$ and $p = (x_0, y_0, z_0) \in U$ and
$(p, f(p))$ is a local extremum for $f$ subject to the constraints $g(x, y, z) = 0$ and $h(x, y, z) = 0$,
where $g_x(p)h_y(p) - h_x(p)g_y(p) \neq 0$.
Then there are numbers $\lambda$ and $\mu$ so that $\nabla f(p) = \lambda \nabla g(p) + \mu \nabla h(p)$.


Proof. By the Implicit Function Theorem, for some $\varepsilon > 0$ it is the case that for all $(x, y, z) \in B_\varepsilon(p)$
satisfying both constraints it is true that $x = x(z)$ and $y = y(z)$ for some $C^1$ functions
$x(z), y(z)$.

Treating $x$ and $y$ as functions of $z$, we can consider $f(x(z), y(z), z)$ as a function of one
variable and since this function takes on an extremum at $z = z_0$ we know $(f(x(z), y(z), z))'(z_0) =
0$. Thus, we can use the chain rule to write (1) $f_z(p) + f_x(p)\frac{\partial x}{\partial z}(z_0) + f_y(p)\frac{\partial y}{\partial z}(z_0) = 0$.

Differentiating the constraint functions with respect to $z$ we obtain (2) $g_z + g_x\frac{\partial x}{\partial z} +
g_y\frac{\partial y}{\partial z} = 0$ and (3) $h_z + h_x\frac{\partial x}{\partial z} + h_y\frac{\partial y}{\partial z} = 0$. This is true at every point, and in particular at
$p$.

By Cramer’s Rule, since $g_x(p)h_y(p) - h_x(p)g_y(p) \neq 0$, we can find unique $\lambda, \mu$ so that
the equations (4) $f_x(p) + \lambda g_x(p) + \mu h_x(p) = 0$ and (5) $f_y(p) + \lambda g_y(p) + \mu h_y(p) = 0$ are
both true.

Adding $\lambda$ times equation (2) plus $\mu$ times equation (3) to equation (1) at point $p$
yields $(f_z(p) + \lambda g_z(p) + \mu h_z(p)) + (f_x(p) + \lambda g_x(p) + \mu h_x(p))\frac{\partial x}{\partial z}(z_0) + (f_y(p) + \lambda g_y(p) +
\mu h_y(p))\frac{\partial y}{\partial z}(z_0) = 0$. Thus, by (4) and (5) it must follow that this simplifies to (6) $f_z(p) +
\lambda g_z(p) + \mu h_z(p) = 0$.
Having established (4), (5) and (6), the result follows (with $-\lambda$ and $-\mu$ serving as the multipliers in the statement).


Example 11.5. Find the absolute extrema of $f(x, y, z) = xyz$ subject to the constraints
$g(x, y, z) = x^2 + y^2 = 8$ and $h(x, y, z) = x + y + z = 0$.


Solution. The intersection of the two constraint surfaces is a regular simple closed curve
which is closed and bounded, so we know that there will be a maximum and a minimum at
a point where $\nabla f = \lambda \nabla g + \mu \nabla h$ and the constraints are satisfied. This gives us the following
equations:

$yz = \lambda(2x) + \mu$
$xz = \lambda(2y) + \mu$
$xy = \mu$
$x^2 + y^2 = 8$
$x + y + z = 0$

If $x$, $y$ or $z$ are zero then $f$ is zero, which cannot be the minimum or maximum of $f$ on
the curve of intersection of the two constraints since we can see that we can get positive and
negative values for $f$ on the constraint curve simply by looking at the graph and seeing that
there are points where two coordinates are positive and one is negative or two are negative
and one is positive. This means that we can divide by any variable without dividing by
zero.

We note that $\mu = xy$ by the third equation. Plugging this into the first two equations and
solving for $\lambda$ gives us that $\lambda = \frac{yz - xy}{2x} = \frac{xz - xy}{2y}$. Multiplying by $2xy$ gives $y^2z - xy^2 =
x^2z - x^2y$. Using the last equation we have $z = -x - y$, so $y^2(-x - y) - xy^2 = x^2(-x - y) - x^2y$,
so $y^2(2x + y) = x^2(2y + x)$ and $(y - x)(x^2 + 3xy + y^2) = 0$. Hence either $x = y$ or
$x^2 + 3xy + y^2 = 0$. In the second case, since $x^2 + y^2 = 8$ from the fourth equation we have
$xy = -\frac{8}{3}$, so $(x + y)^2 = x^2 + y^2 + 2xy = \frac{8}{3}$ and $z = -(x + y) = \pm\frac{2\sqrt{6}}{3}$, which gives
$f = xyz = \mp\frac{16\sqrt{6}}{9}$, values whose absolute value is less than 5. In the first case, $x = y$, the fourth
equation gives us that either $x = 2 = y$ so $z = -4$, or $x = -2 = y$ and $z = 4$.
Testing these values we get $f(2, 2, -4) = -16$, so $(2, 2, -4, -16)$ is the absolute minimum
and $f(-2, -2, 4) = 16$, so $(-2, -2, 4, 16)$ is the absolute maximum of $f$ subject to the two
constraints.
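
As a cross-check on this example, the constraint curve can be sampled directly. The sketch below assumes the parametrization $x = \sqrt{8}\cos\theta$, $y = \sqrt{8}\sin\theta$, $z = -x - y$ (introduced here for the check, not used in the solution) and reports the largest and smallest sampled values of $f$ together with where they occur.

```python
import math


def f(x, y, z):
    return x * y * z


radius = math.sqrt(8.0)
n = 3600  # a multiple of 8, so theta = pi/4 and 5*pi/4 are sample points
values = []
for i in range(n):
    theta = 2.0 * math.pi * i / n
    x = radius * math.cos(theta)
    y = radius * math.sin(theta)
    z = -x - y  # second constraint: x + y + z = 0
    values.append((f(x, y, z), (x, y, z)))

best = max(values)   # (largest f, point where it occurs)
worst = min(values)  # (smallest f, point where it occurs)
print(best)
print(worst)
```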


There are other methods for finding an extremum subject to a smooth constraint curve, 
such as parametrizing the curve. In the next example, we contrast both methods for a 
particular function. 


Example 11.6. Find the absolute extrema of $f(x, y) = 2x + 3y$ subject to the constraint
$x^2 + y^2 = 1$.


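Solution. Here the constraint curve is just the unit circle. Proceeding with Lagrange
multipliers in the usual way we have the system:

$2 = \lambda(2x)$
$3 = \lambda(2y)$
$x^2 + y^2 = 1$

Hence, we know that $x, y \neq 0$ and solving for $\lambda$ gives us $\lambda = \frac{1}{x} = \frac{3}{2y}$, so $2y = 3x$ and
$y = \frac{3}{2}x$. Substituting this into the third equation gives us $x^2 + \frac{9}{4}x^2 = 1$, so $x^2 = \frac{4}{13}$ and $x =
\pm\frac{2}{\sqrt{13}}$. If $x = \frac{2}{\sqrt{13}}$ then $y = \frac{3}{\sqrt{13}}$, and if $x = -\frac{2}{\sqrt{13}}$ then $y = -\frac{3}{\sqrt{13}}$. Since rationalized
denominators are traditionally considered more simplified, we would say that the points
at which extrema may occur are $\left(\frac{2\sqrt{13}}{13}, \frac{3\sqrt{13}}{13}\right)$ and $\left(\frac{-2\sqrt{13}}{13}, \frac{-3\sqrt{13}}{13}\right)$. Plugging into $f$
we have $f\left(\frac{2\sqrt{13}}{13}, \frac{3\sqrt{13}}{13}\right) = \frac{4\sqrt{13}}{13} + \frac{9\sqrt{13}}{13} = \frac{13\sqrt{13}}{13} = \sqrt{13}$ and $f\left(\frac{-2\sqrt{13}}{13}, \frac{-3\sqrt{13}}{13}\right) =
-\sqrt{13}$. Thus, $f$ has an absolute maximum $\left(\frac{2\sqrt{13}}{13}, \frac{3\sqrt{13}}{13}, \sqrt{13}\right)$ and an absolute minimum
$\left(\frac{-2\sqrt{13}}{13}, \frac{-3\sqrt{13}}{13}, -\sqrt{13}\right)$.
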
Next, we use the other method to find the extrema of the function subject to the
constraint which we mentioned. We parametrize the unit circle as $r(t) = \langle \cos(t), \sin(t) \rangle$
over $[0, 2\pi]$ then we see that on the constraint curve we have $F(t) = f(r(t)) = 2\cos(t) +
3\sin(t)$, which has derivative $F'(t) = 3\cos(t) - 2\sin(t)$. We then use the usual method for
finding absolute extrema of a differentiable function of one variable on a closed interval. We
check $F$ at the end points of the parametrization ($0$ and $2\pi$) and at the values of $t$ where
$F'(t) = 0$. We see $F(0) = F(2\pi) = 2$. Setting $3\cos(t) - 2\sin(t) = 0$ we have that $\tan(t) = \frac{3}{2}$,
so $t = \tan^{-1}(\frac{3}{2})$ or $t = \pi + \tan^{-1}(\frac{3}{2})$. Drawing a right triangle we see that if $t = \tan^{-1}(\frac{3}{2})$ then
the side opposite $t$ has length 3 and the adjacent side has length 2, so the hypotenuse has length
$\sqrt{13}$ by the Pythagorean Theorem.

From this we see that $\sin(t) = \frac{3}{\sqrt{13}}$ and $\cos(t) = \frac{2}{\sqrt{13}}$. In the case where $t = \pi +
\tan^{-1}(\frac{3}{2})$, the sine and cosine are negated, so plugging into $F$ we get the same values as
before for the absolute extrema.
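
The parametrization computation is easy to confirm numerically. This short sketch (function names `F` and `dF` are chosen here to match the notation above) evaluates $F$ at the end points and at the two solutions of $F'(t) = 0$.

```python
import math


def F(t):
    """f restricted to the unit circle: F(t) = 2 cos t + 3 sin t."""
    return 2.0 * math.cos(t) + 3.0 * math.sin(t)


def dF(t):
    """Derivative F'(t) = 3 cos t - 2 sin t."""
    return 3.0 * math.cos(t) - 2.0 * math.sin(t)


t1 = math.atan(1.5)       # tan(t) = 3/2 in the first quadrant
t2 = math.pi + t1         # the other solution in [0, 2*pi]
for t in (0.0, t1, t2, 2.0 * math.pi):
    print(t, F(t), dF(t))

print(math.sqrt(13.0))    # compare with F(t1) and -F(t2)
```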


In this case (and many others) the method of Lagrange multipliers is probably a little
easier, but there are instances where thinking of the right parametrization is probably
better than the usual Lagrange multiplier process. Usually, if in doubt, it is best to assume
Lagrange multipliers will probably be the easier process and look for a parametrization only
after finding that the generated equation system does not seem tractable.

When we are trying to find absolute extrema over a piecewise smooth curve, we look 
at each piece separately, and the hypotheses for Lagrange multipliers usually only work 
at points other than the end points of the piece decomposition, so the ends are tested 
separately. If parametrizations are used, we likewise parametrize each piece and use the 
ends of the piece decomposition intervals as end points to test as well. 

Here is an example with a square constraint. Each side of the square is part of the graph 
of a differentiable function, but the corners are not points where the curve is differentiable. 


Example 11.7. Find the absolute extrema of $f(x, y) = 2x + 3y$ subject to the constraint
curve consisting of the square whose sides are contained in $x = \pm 1$ and $y = \pm 1$.


Solution. Here the constraint curve is a square. We have not satisfied the criteria for the 
constraint curve at the ends of the sides of the square, but if an extremum occurs inside the 
sides of the square other than at the corners, it should show up with Lagrange multipliers. 
This means we will have to check the corners of the square. At other points, we have either
$x = \pm 1$ or $y = \pm 1$. Thus, depending on which side of the square we are in, we could have
a system looking like one of these:

$2 = \lambda(1)$ or $2 = \lambda(0)$

$3 = \lambda(0)$ or $3 = \lambda(1)$

$x = \pm 1$ or $y = \pm 1$

This system has no solutions so there are no extrema on the constraint except at the 
corners (where the Lagrange multipliers theorem is not applicable). Since the Extreme 
Value Theorem guarantees that f does have absolute extrema over the closed and bounded 
square we just test the corners to identify the absolute extrema. At (1,1) we get f(1,1) =5 
is the absolute maximum value, and at (—1,—1) we get f(—1,—1) = —5 is the absolute 
minimum value. 
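
A brute-force check of the square example, as a sketch: the code samples all four sides (including the corners) and reports where the largest and smallest values of $f$ occur. The sample count `n` and the variable names are arbitrary choices made for the illustration.

```python
def f(x, y):
    return 2.0 * x + 3.0 * y


n = 100
boundary = []
for i in range(n + 1):
    s = -1.0 + 2.0 * i / n  # runs from -1 to 1 along a side
    boundary += [(1.0, s), (-1.0, s), (s, 1.0), (s, -1.0)]

best = max(boundary, key=lambda p: f(*p))
worst = min(boundary, key=lambda p: f(*p))
print(best, f(*best))
print(worst, f(*worst))
```

Because $f$ is linear, its restriction to each side is monotone, which is why the sampled extrema land on corners.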


Now, in this example the location of the absolute extrema was fairly clear from the 
outset. There are quite a few problems like that, but in many cases (probably most) it is 
hard to simply see the value that will lead to an extremum. 


Not all constrained domains over which we may wish to find an extremum are level
curves, level surfaces, or regular curves. For example, we might want to find the extrema
of a function over the closed set bounded by an ellipse in the plane (or an ellipsoid in three 
dimensional Euclidean space). In such cases we find critical points in the interior of the set 
in question since an absolute extremum on the interior of a domain must also be a local 
extremum, and then use Lagrange multipliers (or parametrizations) to find the absolute 
extrema on the constraint. We compare the values of the function at the critical points in 
the interior and the absolute extrema on the boundary to find the absolute extrema over 
the set. 


Example 11.8. Find the absolute extrema of $f(x, y) = x^2 + y^2 + xy^2$ over $D = \{(x, y) \in
\mathbb{R}^2 \mid x^2 + y^2 \le 1\}$.


Solution. We need both local extrema and extrema on the boundary curve. So, we take

$f_x = 2x + y^2 = 0$
$f_y = 2y + 2xy = 0$

Thus $2x = -y^2$, so $2y - y^3 = 0$ and so $y(2 - y^2) = 0$ which means that $y = 0$ or $y = \pm\sqrt{2}$.
If $y = 0$ then $x = 0$ and if $y = \pm\sqrt{2}$ then $x = -1$.

Only one of these critical points is inside the unit disk D, so we include (0,0) in our set 
of points to be tested. 

We then use Lagrange multipliers to find the absolute extrema on the boundary of $D$,
which is the constraint circle $x^2 + y^2 = 1$. We could parametrize the circle, but most likely
the Lagrange multiplier process will be more efficient. This gives us:

$2x + y^2 = \lambda(2x)$

$2y + 2xy = \lambda(2y)$

$x^2 + y^2 = 1$

If $x = 0$ then $y = \pm 1$. If $y = 0$ then $x = \pm 1$. Otherwise we can divide by $2x$ and $2y$
to give $\lambda = 1 + \frac{y^2}{2x} = 1 + x$, so $2x^2 = y^2$ and therefore $3x^2 = 1$ from the third equation, so
$x = \pm\frac{\sqrt{3}}{3}$. Thus, $y = \pm\frac{\sqrt{6}}{3}$. Checking each of these possibilities we see that $f(0, 0) = 0$,
$f(0, \pm 1) = 1$, $f(\pm 1, 0) = 1$, $f\left(\frac{\sqrt{3}}{3}, \pm\frac{\sqrt{6}}{3}\right) = 1 + \frac{2\sqrt{3}}{9}$ and $f\left(-\frac{\sqrt{3}}{3}, \pm\frac{\sqrt{6}}{3}\right) = 1 - \frac{2\sqrt{3}}{9}$. Hence,
the absolute minimum is $(0, 0, 0)$ and the absolute maxima are $\left(\frac{\sqrt{3}}{3}, \pm\frac{\sqrt{6}}{3}, 1 + \frac{2\sqrt{3}}{9}\right)$.
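
The comparison of interior critical points with boundary candidates can be mirrored in a few lines. The sketch below evaluates $f$ at the candidate points listed in the solution and also samples the closed disk on a polar grid (the grid sizes are arbitrary choices); the sampled maximum should sit just at or below the candidate maximum.

```python
import math


def f(x, y):
    return x * x + y * y + x * y * y


xc, yc = 1.0 / math.sqrt(3.0), math.sqrt(2.0 / 3.0)
candidates = [
    (0.0, 0.0),                               # interior critical point
    (0.0, 1.0), (0.0, -1.0), (1.0, 0.0), (-1.0, 0.0),
    (xc, yc), (xc, -yc), (-xc, yc), (-xc, -yc),
]
cand_max = max(f(*p) for p in candidates)
cand_min = min(f(*p) for p in candidates)
print(cand_max, 1.0 + 2.0 * math.sqrt(3.0) / 9.0, cand_min)

# Polar grid over the closed unit disk.
n_r, n_theta = 50, 720
sampled = []
for i in range(n_r + 1):
    r = i / n_r
    for j in range(n_theta):
        theta = 2.0 * math.pi * j / n_theta
        sampled.append(f(r * math.cos(theta), r * math.sin(theta)))
print(max(sampled), min(sampled))
```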


It is natural to ask, “should we test points that are local extrema of the function to
be optimized which occur on the boundary constraint rather than on the interior of the
domain over which we are looking for extrema?” The answer is that it does no harm to do
so but it is unnecessary since those points would show up as potential extrema subject to 
the boundary constraint if they were points at which an extremum could occur. 




Exercises: 


Exercise 11.1. Let $V$ be a convex open subset of $\mathbb{R}^n$ and let $f : V \to \mathbb{R}^m$ be $C^1$. Prove
that if $|Df(\mathbf{a})|$ is bounded on $V$ then $f$ is uniformly continuous.


Exercise 11.2. Let $I$ be a non-empty open interval and let $f : I \to \mathbb{R}^n$ be differentiable on
$I$. If $f(I)$ is contained in the boundary of some open ball $B_r(\mathbf{0})$ about the origin, then prove
that $f(t)$ and $f'(t)$ are orthogonal for all $t \in I$.


Exercise 11.3. Let $B = \{(x,y) \in \mathbb{R}^2 \mid x^2 + y^2 < 1\}$ and let $f : B \to \mathbb{R}^2$ be a $C^1$ one to
one function so that $\Delta_f \ne 0$ on $B$. Let $M$ be a connected subset of $f(B)$. Then $f^{-1}(M)$ is
connected.


Exercise 11.4. Prove that there are functions $u(x,y)$, $v(x,y)$, $w(x,y)$, and an $r > 0$ such
that $u, v, w$ are continuously differentiable and satisfy the equations

$$u^5 + xv^2 - y + w = 3$$
$$v^5 + yu^2 - x + w = 3$$
$$w^2 + y^5 - x^4 = 4$$

on $B_r(1,1)$, so that $u(1,1) = 1$, $v(1,1) = 1$, and $w(1,1) = 2$.


Exercise 11.5. Give the second order Taylor polynomial based at the origin in the direction
$\langle h,k \rangle$, and write the formula for the remainder for $f(x,y) = e^{xy}$.


Exercise 11.6. Let $f(u,v) = (uv, u^2 + v^2)$ be defined on the portion of the first quadrant
of the $uv$-plane with $v > u$. Find the derivative of $f^{-1}$ at the point $(3,10)$.


Exercise 11.7. Find the absolute extrema of $f(x,y,z) = xyz$ over the compact solid bounded
by the ellipsoid $x^2 + y^2 + z^2 = 12$.


Exercise 11.8. Let $f : \mathbb{R}^2 \to \mathbb{R}^3$ and $g : \mathbb{R}^3 \to \mathbb{R}^4$ be defined by $f(x,y) = (x^2y, x + 2y, 3y + x)$
and $g(u,v,w) = (v^2, 3u + v, 2w + 3v, u + w)$. Use the Chain Rule to find the derivative of
$g \circ f$ at $(1,2)$.




Exercise 11.9. Let $f(x,y,z,w) : \mathbb{R}^4 \to \mathbb{R}$ be $C^2$. Prove that $f_{xz} = f_{zx}$. More generally, if
$g : \mathbb{R}^n \to \mathbb{R}$ is $C^2$ and $x_i, x_j$ are variables of the domain then $g_{x_i x_j} = g_{x_j x_i}$.


Exercise 11.10. Let $g : \mathbb{R}^n \to \mathbb{R}$ be a $C^{k+2}$ function. Then for any finite sequence of
integers $i_1, i_2, \ldots, i_k \in \{1,2,\ldots,n\}$, and any permutation (one to one correspondence) $P$ of
the order of these finitely many integers to give a re-ordering $P(i_1), P(i_2), \ldots, P(i_k)$, it is
always true that $g_{x_{i_1} x_{i_2} \cdots x_{i_k}} = g_{x_{P(i_1)} x_{P(i_2)} \cdots x_{P(i_k)}}$.


Exercise 11.11. Find the tangent plane to the surface $z^2 = 2z + 2x + 5xy + y^2$ at the point
$(1,1,4)$.


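Exercise 11.12. Taylor's series in $\mathbb{R}^n$.
Let $f : U \to \mathbb{R}$ be a $C^\infty$ function, where $U$ is open in $\mathbb{R}^n$ and $L(\mathbf{x},\mathbf{x}+\mathbf{h}) \subset U$, and suppose that $\lim_{k\to\infty} \frac{1}{(k+1)!}D^{(k+1)}f(\mathbf{c},\mathbf{h}) = 0$ for all $\mathbf{c} \in L(\mathbf{x},\mathbf{x}+\mathbf{h})$. Then

$$f(\mathbf{x}+\mathbf{h}) = f(\mathbf{x}) + \sum_{i=1}^{\infty} \frac{1}{i!}D^{(i)}f(\mathbf{x},\mathbf{h}).$$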


Exercise 11.13. True or false (assume sets are in R”, and give a brief justification for 
your answers): 

(a) The continuous image of a closed set is always closed. 

(b) Every differentiable function is continuous. 

(c) Every function whose partial derivatives exist at every point is continuous. 

(d) Every function whose partial derivatives exist at every point is differentiable. 

(e) Every function whose partial derivatives are continuous at every point of the domain
of the function is differentiable. 

(f) The continuous image of a closed bounded set is always closed. 

(g) The boundary of a set is always closed. 

(h) A function from $\mathbb{R}^n$ to $\mathbb{R}^m$ is continuous if and only if each of its component functions
is continuous.

(i) A function is differentiable if and only if each of its component functions is differentiable. 


(j) Let $f : \mathbb{R}^2 \to \mathbb{R}$. If $\lim_{x \to 0} f(ax,bx) = 0$ for all $\langle a,b \rangle \in \mathbb{R}^2$ then $\lim_{(x,y)\to(0,0)} f(x,y) = 0$.

(k) Let $f : \mathbb{R}^2 \to \mathbb{R}$. If the first partial derivatives of $f$ exist and are continuous
everywhere and the second partial derivatives exist at the point $(a,b)$ then $f_{xy}(a,b) = f_{yx}(a,b)$.

(l) The continuous image of a connected set is always closed. 

(m) The inverse image of a connected set under a continuous function is always connected. 

(n) If the graph of a real valued function defined on a closed interval is closed and 
connected then it is also compact. 

(o) The union of two compact sets is always compact.




(p) The intersection of two connected sets is always connected. 

(q) A function $f : E \to \mathbb{R}^m$, where $E \subset \mathbb{R}^n$, is continuous if and only if for every open
set $U \subset \mathbb{R}^m$ the set $f^{-1}(U)$ is open in $E$.

(r) A function $f : E \to \mathbb{R}^m$, where $E \subset \mathbb{R}^n$, is continuous if and only if for every closed
set $A \subset \mathbb{R}^m$ the set $f^{-1}(A)$ is closed in $E$.

(s) A function $f : E \to \mathbb{R}^m$, where $E \subset \mathbb{R}^n$, is continuous if and only if for every compact
set $K \subset \mathbb{R}^m$ the set $f^{-1}(K)$ is compact.




Solutions: 


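Solution to Exercise 11.1. Let $V$ be a convex open subset of $\mathbb{R}^n$ and let $f : V \to \mathbb{R}^m$ be
$C^1$. Prove that if $|Df(\mathbf{a})|$ is bounded on $V$ then $f$ is Lipschitz (and thus uniformly continuous)
on $V$.


Proof. Choose $M$ so that $|Df(\mathbf{x})| \le M$ on $V$. Let $\mathbf{x},\mathbf{y} \in V$. Since $V$ is convex, we know
that $L(\mathbf{x},\mathbf{y}) \subset V$. By the Mean Value Theorem for Vector Valued Functions we can find
$\mathbf{c} \in L(\mathbf{x},\mathbf{y})$ so that $(f(\mathbf{x}) - f(\mathbf{y})) \cdot Df(\mathbf{c})(\mathbf{x} - \mathbf{y}) = |f(\mathbf{x}) - f(\mathbf{y})|^2$. Thus, $|f(\mathbf{x}) - f(\mathbf{y})|^2 \le
|f(\mathbf{x}) - f(\mathbf{y})|\,|Df(\mathbf{c})|\,|\mathbf{x} - \mathbf{y}|$, so $|f(\mathbf{x}) - f(\mathbf{y})| \le M|\mathbf{x} - \mathbf{y}|$, so $f$ is Lipschitz (and uniformly
continuous) on $V$.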


Solution to Exercise 11.2. Let $I$ be a non-empty open interval and let $f : I \to \mathbb{R}^n$ be
differentiable on $I$. If $f(I)$ is contained in the boundary of some open ball $B_r(\mathbf{0})$ about the
origin, then prove that $f(t)$ and $f'(t)$ are orthogonal for all $t \in I$.


Proof. Since $f(t) \cdot f(t) = r^2$ is constant, we can use the dot product rule to get that
$2f'(t) \cdot f(t) = 0$, so $f'(t)$ is perpendicular to $f(t)$.


Solution to Exercise 11.3. Let $B = \{(x,y) \in \mathbb{R}^2 \mid x^2 + y^2 < 1\}$ and let $f : B \to \mathbb{R}^2$ be a
$C^1$ one to one function so that $\Delta_f \ne 0$ on $B$. Let $M$ be a connected subset of $f(B)$. Then
$f^{-1}(M)$ is connected.


Proof. By Theorem 11.24, $f$ is a homeomorphism and $f^{-1}$ is continuous. Since the continuous
image of a connected set is connected, $f^{-1}(M)$ is connected.


Solution to Exercise 11.4. Prove that there are functions $u(x,y)$, $v(x,y)$, $w(x,y)$, and an
$r > 0$ such that $u, v, w$ are continuously differentiable and satisfy the equations

$$u^5 + xv^2 - y + w = 3$$
$$v^5 + yu^2 - x + w = 3$$
$$w^2 + y^5 - x^4 = 4$$

on $B_r(1,1)$, so that $u(1,1) = 1$, $v(1,1) = 1$, and $w(1,1) = 2$.


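Proof. Let $F(u,v,w,x,y) = (u^5 + xv^2 - y + w - 3,\; v^5 + yu^2 - x + w - 3,\; w^2 + y^5 - x^4 - 4)$. Note
that $F(1,1,2,1,1) = (0,0,0)$ and that $F$ is $C^1$ on all of $\mathbb{R}^5$. Also note that

$$\frac{\partial(F_1,F_2,F_3)}{\partial(u,v,w)} = \begin{vmatrix} 5u^4 & 2xv & 1 \\ 2yu & 5v^4 & 1 \\ 0 & 0 & 2w \end{vmatrix}, \quad\text{so}\quad \frac{\partial(F_1,F_2,F_3)}{\partial(u,v,w)}(1,1,2,1,1) = \begin{vmatrix} 5 & 2 & 1 \\ 2 & 5 & 1 \\ 0 & 0 & 4 \end{vmatrix} = 84 \ne 0.$$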




By the Implicit Function Theorem, there is an $r > 0$ and a unique $C^1$ function $g :
B_r(1,1) \to \mathbb{R}^3$ so that $g(1,1) = (1,1,2)$ and $F(g(x,y),x,y) = (0,0,0)$ for all $(x,y) \in B_r(1,1)$, where $g(x,y) =
(u(x,y), v(x,y), w(x,y))$ and thus $u, v, w$ are $C^1$ on $B_r(1,1)$. Since $F(g(x,y),x,y) = (0,0,0)$,
it follows that $u^5 + xv^2 - y + w = 3$, $v^5 + yu^2 - x + w = 3$, and $w^2 + y^5 - x^4 = 4$ for all
$(x,y) \in B_r(1,1)$.
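The two facts the Implicit Function Theorem needs here, that $F$ vanishes at the base point and that the Jacobian determinant with respect to $(u,v,w)$ is non-zero there, can be checked by direct arithmetic. The sketch below assumes the reading of $F$ given in the proof; the function names are illustrative only.

```python
def F(u, v, w, x, y):
    """The map F assumed from the proof. At x = y = 1 the powers of x and y
    in the third component do not affect the value being checked."""
    return (u**5 + x * v**2 - y + w - 3,
            v**5 + y * u**2 - x + w - 3,
            w**2 + y**5 - x**4 - 4)

def jacobian_uvw(u, v, w, x, y):
    """Partial derivatives of (F1, F2, F3) with respect to (u, v, w)."""
    return [[5 * u**4, 2 * x * v, 1],
            [2 * y * u, 5 * v**4, 1],
            [0, 0, 2 * w]]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

point = (1, 1, 2, 1, 1)            # (u, v, w, x, y)
value_at_point = F(*point)          # should be the zero vector
jacobian_det = det3(jacobian_uvw(*point))
```

Because the third row of the matrix is $(0, 0, 2w)$, the determinant reduces to $2w(25u^4v^4 - 4xyuv)$, which is how the value at the base point is obtained by hand.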


Solution to Exercise 11.5. Give the second order Taylor polynomial based at the origin
in the direction $\langle h,k \rangle$, and write the formula for the remainder for $f(x,y) = e^{xy}$.


Solution. The first and second order derivatives are $f_x = ye^{xy}$, $f_y = xe^{xy}$, $f_{xx} = y^2e^{xy}$,
$f_{yy} = x^2e^{xy}$ and $f_{xy} = f_{yx} = e^{xy} + xye^{xy}$. The only first or second partial derivatives
that are non-zero at $(0,0)$ are $f_{xy}(0,0) = f_{yx}(0,0) = 1$. Thus, the second order Taylor
polynomial is

$$f(0,0) + 0(h) + 0(k) + \frac{1}{2!}\left(0(h^2) + 0(k^2) + 1(hk) + 1(kh)\right) = 1 + \frac{2hk}{2} = 1 + hk.$$

The third order derivatives are $f_{xxx} = y^3e^{xy}$, $f_{yyy} = x^3e^{xy}$, $f_{xxy} = 2ye^{xy} + xy^2e^{xy} = f_{xyx} = f_{yxx}$,
and $f_{yyx} = 2xe^{xy} + x^2ye^{xy} = f_{yxy} = f_{xyy}$. Thus, the remainder is

$$\frac{e^{c_1c_2}}{3!}\left(c_2^3h^3 + c_1^3k^3 + (6c_2 + 3c_1c_2^2)h^2k + (6c_1 + 3c_1^2c_2)hk^2\right)$$

for some point $(c_1,c_2)$ on the line segment between $(0,0)$ and $(h,k)$.
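As a quick numerical illustration of Exercise 11.5, one can compare $e^{hk}$ with the polynomial $1 + hk$ for small $(h,k)$. This is only a sketch; the sample points and the names `taylor2` and `error` are chosen here for illustration.

```python
import math

def f(x, y):
    return math.exp(x * y)

def taylor2(h, k):
    """Second order Taylor polynomial of e^{xy} at the origin."""
    return 1 + h * k

def error(h, k):
    """Absolute difference between f and its second order polynomial."""
    return abs(f(h, k) - taylor2(h, k))

e1 = error(0.1, 0.2)      # displacement (h, k)
e2 = error(0.05, 0.1)     # the same direction, half the length
ratio = e1 / e2
```

Halving the displacement shrinks the error by a factor close to 16 in this example, which is at least as fast as the cubic decay that the third order remainder formula guarantees.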


Solution to Exercise 11.6. Let $f(u,v) = (uv, u^2 + v^2)$ be defined on the portion of the
first quadrant of the $uv$-plane with $v > u$. Find the derivative of $f^{-1}$ at the point $(3,10)$.


Solution. We have $Df(u,v) = \begin{bmatrix} v & u \\ 2u & 2v \end{bmatrix}$. Since the determinant $2v^2 - 2u^2$ of the derivative matrix
is non-zero when $v > u$ and the function is one to one, by the Inverse
Function Theorem the function $f^{-1}$ is $C^1$ and the derivative of $f^{-1}$ at $(3,10)$ is the inverse of
$Df$ at the inverse of $(3,10)$, which is $(1,3)$, where $Df(1,3) = \begin{bmatrix} 3 & 1 \\ 2 & 6 \end{bmatrix}$. The inverse of this
matrix is $\frac{1}{16}\begin{bmatrix} 6 & -1 \\ -2 & 3 \end{bmatrix}$, which is the derivative of $f^{-1}$ at $(3,10)$.
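The matrix arithmetic in this solution can be verified with exact rational numbers. The sketch below assumes the reading $f(u,v) = (uv, u^2 + v^2)$, which is the one consistent with $f(1,3) = (3,10)$; the helper `inverse_2x2` is illustrative.

```python
from fractions import Fraction

def f(u, v):
    """Assumed map from Exercise 11.6."""
    return (u * v, u**2 + v**2)

def Df(u, v):
    """Derivative matrix of f at (u, v)."""
    return [[v, u], [2 * u, 2 * v]]

def inverse_2x2(m):
    """Exact inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

image_point = f(1, 3)                 # the preimage of (3, 10) is (1, 3)
D_inverse = inverse_2x2(Df(1, 3))     # derivative of the inverse at (3, 10)
```

The entries come out as $6/16$, $-1/16$, $-2/16$, and $3/16$ in lowest terms, matching the matrix above.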


Solution to Exercise 11.7. Find the absolute extrema of $f(x,y,z) = xyz$ over the compact
solid $E$ bounded by the ellipsoid $x^2 + y^2 + z^2 = 12$.


Solution. Since $E$ is compact, there is a maximum and a minimum value of $f$ on $E$. The
extrema could occur at the boundary or on the interior of the solid. To check the interior
we set $f_x = yz = 0$, $f_y = xz = 0$, and $f_z = xy = 0$. Thus, the points where all partial
derivatives are zero are those where at least two of the variables are zero. However, if any variable
is zero then the function's value is zero, which is not the maximum or minimum value of
the function on $E$.




Using Lagrange multipliers then, if an extremum occurs on the boundary, $f_x = yz =
\lambda(2x)$, $f_y = xz = \lambda(2y)$, and $f_z = xy = \lambda(2z)$. Since the extrema do not occur when a
variable is zero for this function, we can divide to get $\lambda = \frac{yz}{2x} = \frac{xz}{2y} = \frac{xy}{2z}$. Hence, $x^2 =
y^2 = z^2 = 4$. Checking the points where this occurs, we have $f(2,2,2) = 8$, $f(2,-2,-2) =
8$, $f(-2,2,-2) = 8$, $f(-2,-2,2) = 8$, giving the absolute maxima, and $f(2,2,-2) =
f(2,-2,2) = f(-2,2,2) = f(-2,-2,-2) = -8$, giving the absolute minima.


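Solution to Exercise 11.8. Let $f : \mathbb{R}^2 \to \mathbb{R}^3$ and $g : \mathbb{R}^3 \to \mathbb{R}^4$ be defined by $f(x,y) =
(x^2y, x + 2y, 3y + x)$ and $g(u,v,w) = (v^2, 3u + v, 2w + 3v, u + w)$. Use the Chain Rule to
find the derivative of $g \circ f$ at $(1,2)$.


Solution. By the Chain Rule, $D(g \circ f)(1,2) = Dg(f(1,2))\,Df(1,2)$. We know

$$Dg = \begin{bmatrix} 0 & 2v & 0 \\ 3 & 1 & 0 \\ 0 & 3 & 2 \\ 1 & 0 & 1 \end{bmatrix} \quad\text{and}\quad Df = \begin{bmatrix} 2xy & x^2 \\ 1 & 2 \\ 1 & 3 \end{bmatrix},$$

and that $f(1,2) = (2,5,7)$. Hence,

$$D(g \circ f)(1,2) = Dg(2,5,7)\,Df(1,2) = \begin{bmatrix} 0 & 10 & 0 \\ 3 & 1 & 0 \\ 0 & 3 & 2 \\ 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 4 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix} = \begin{bmatrix} 10 & 20 \\ 13 & 5 \\ 5 & 12 \\ 5 & 4 \end{bmatrix}.$$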
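The Chain Rule product in this solution is a plain matrix multiplication, so it is easy to recompute. The following sketch assumes $f(x,y) = (x^2y, x + 2y, 3y + x)$ and $g(u,v,w) = (v^2, 3u + v, 2w + 3v, u + w)$; `matmul` is a small illustrative helper rather than a library call.

```python
def f(x, y):
    return (x**2 * y, x + 2 * y, 3 * y + x)

def Dg(u, v, w):
    """4x3 derivative matrix of g at (u, v, w)."""
    return [[0, 2 * v, 0],
            [3, 1, 0],
            [0, 3, 2],
            [1, 0, 1]]

def Df(x, y):
    """3x2 derivative matrix of f at (x, y)."""
    return [[2 * x * y, x**2],
            [1, 2],
            [1, 3]]

def matmul(A, B):
    """Product of matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

inner_point = f(1, 2)
chain = matmul(Dg(*inner_point), Df(1, 2))   # D(g o f)(1, 2)
```

The result is a $4 \times 2$ matrix, as expected for a map from $\mathbb{R}^2$ to $\mathbb{R}^4$.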


Solution to Exercise 11.9. Let $f(x,y,z,w) : \mathbb{R}^4 \to \mathbb{R}$ be $C^2$. Prove that $f_{xz} = f_{zx}$. More
generally, if $g : \mathbb{R}^n \to \mathbb{R}$ is $C^2$ and $x_i, x_j$ are variables of the domain then $g_{x_i x_j} = g_{x_j x_i}$.


Proof. If $y$ and $w$ are held fixed then $f(x,y,z,w)$ is a function
of two variables, namely $x$ and $z$, so $f_{xz} = f_{zx}$ by Clairaut's Theorem.

In general, if $g : \mathbb{R}^n \to \mathbb{R}$ and $x_i, x_j$ are variables of the domain then the partial
derivatives with respect to these variables are found treating the other variables as fixed
values. This means that $g$ can be considered a two variable function for purposes of partial
derivatives with respect to just the variables $x_i, x_j$, so $g_{x_i x_j} = g_{x_j x_i}$.


Solution to Exercise 11.10. Let $g : \mathbb{R}^n \to \mathbb{R}$ be a $C^{k+2}$ function. Then for any
finite sequence of integers $i_1, i_2, \ldots, i_k \in \{1,2,\ldots,n\}$, and any permutation (one to one
correspondence) $P$ of the order of these finitely many integers to give a re-ordering $P(i_1), P(i_2), \ldots, P(i_k)$,
it is always true that $g_{x_{i_1} x_{i_2} \cdots x_{i_k}} = g_{x_{P(i_1)} x_{P(i_2)} \cdots x_{P(i_k)}}$.


Proof. Let $p_j = P(i_j)$. By Exercise 11.9, for each $1 \le j < k$ we have $\left(g_{x_{i_1} \cdots x_{i_{j-1}}}\right)_{x_{i_j} x_{i_{j+1}}} = \left(g_{x_{i_1} \cdots x_{i_{j-1}}}\right)_{x_{i_{j+1}} x_{i_j}}$,
which means that $g_{x_{i_1} \cdots x_{i_j} x_{i_{j+1}} \cdots x_{i_k}} = g_{x_{i_1} \cdots x_{i_{j+1}} x_{i_j} \cdots x_{i_k}}$. Hence, we can switch
any adjacent two variables in order. By switching adjacent pairs until $x_{p_1}$ is in the first
position we see that $g_{x_{i_1} x_{i_2} \cdots x_{i_k}}$ is equal to a partial derivative whose first variable is $x_{p_1}$. Then by repeatedly
switching adjacent pairs we can move $x_{p_2}$ into the second order derivative position and so
on until we get that $g_{x_{i_1} x_{i_2} \cdots x_{i_k}} = g_{x_{P(i_1)} x_{P(i_2)} \cdots x_{P(i_k)}}$.


Solution to Exercise 11.11. Find the tangent plane to the surface $z^2 = 2z + 2x + 5xy + y^2$
at the point $(1,1,4)$.


Solution. Setting $F = 2z + 2x + 5xy + y^2 - z^2$, we have $F_x = 2 + 5y$, $F_y = 5x + 2y$, $F_z = 2 - 2z$,
so $\nabla F(1,1,4) = \langle 7, 7, -6 \rangle$. The tangent plane is $7(x - 1) + 7(y - 1) - 6(z - 4) = 0$.


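Solution to Exercise 11.12. Taylor's series in $\mathbb{R}^n$. Let $f : U \to \mathbb{R}$ be a $C^\infty$ function,
where $U$ is open in $\mathbb{R}^n$ and $L(\mathbf{x},\mathbf{x}+\mathbf{h}) \subset U$, and $\lim_{k\to\infty} \frac{1}{(k+1)!}D^{(k+1)}f(\mathbf{c},\mathbf{h}) = 0$ for all $\mathbf{c} \in L(\mathbf{x},\mathbf{x}+\mathbf{h})$.
Then $f(\mathbf{x}+\mathbf{h}) = f(\mathbf{x}) + \sum_{i=1}^{\infty} \frac{1}{i!}D^{(i)}f(\mathbf{x},\mathbf{h})$.


Proof. By Taylor's Theorem for multivariable functions, for every $k \in \mathbb{N}$ we know

$$f(\mathbf{x}+\mathbf{h}) = f(\mathbf{x}) + \sum_{i=1}^{k} \frac{1}{i!}D^{(i)}f(\mathbf{x},\mathbf{h}) + \frac{1}{(k+1)!}D^{(k+1)}f(\mathbf{c},\mathbf{h})$$

for some point $\mathbf{c} \in L(\mathbf{x},\mathbf{x}+\mathbf{h})$.
Since $\lim_{k\to\infty} \frac{1}{(k+1)!}D^{(k+1)}f(\mathbf{c},\mathbf{h}) = 0$, it follows that

$$\lim_{k\to\infty}\left(f(\mathbf{x}+\mathbf{h}) - f(\mathbf{x}) - \sum_{i=1}^{k} \frac{1}{i!}D^{(i)}f(\mathbf{x},\mathbf{h})\right) = 0,$$

so $f(\mathbf{x}+\mathbf{h}) = f(\mathbf{x}) + \sum_{i=1}^{\infty} \frac{1}{i!}D^{(i)}f(\mathbf{x},\mathbf{h})$.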


Solution to Exercise 11.13. True or false (assume sets are in $\mathbb{R}^n$, and give a brief
justification for your answers):

(a) The continuous image of a closed set is always closed. 

(b) Every differentiable function is continuous. 

(c) Every function whose partial derivatives exist at every point is continuous. 

(d) Every function whose partial derivatives exist at every point is differentiable. 

(e) Every function whose partial derivatives are continuous at every point of the domain
of the function is differentiable. 

(f) The continuous image of a closed bounded set is always closed. 

(g) The boundary of a set is always closed. 

(h) A function from $\mathbb{R}^n$ to $\mathbb{R}^m$ is continuous if and only if each of its component functions
is continuous.

(i) A function is differentiable if and only if each of its component functions is differentiable. 


(j) Let $f : \mathbb{R}^2 \to \mathbb{R}$. If $\lim_{x \to 0} f(ax,bx) = 0$ for all $\langle a,b \rangle \in \mathbb{R}^2$ then $\lim_{(x,y)\to(0,0)} f(x,y) = 0$.

(k) Let $f : \mathbb{R}^2 \to \mathbb{R}$. If the first partial derivatives of $f$ exist and are continuous
everywhere and the second partial derivatives exist at the point $(a,b)$ then $f_{xy}(a,b) = f_{yx}(a,b)$.




(l) The continuous image of a connected set is always closed. 

(m) The inverse image of a connected set under a continuous function is always connected. 

(n) If the graph of a real valued function defined on a closed interval is closed and 
connected then it is also compact. 

(o) The union of two compact sets is always compact.

(p) The intersection of two connected sets is always connected. 

(q) A function $f : E \to \mathbb{R}^m$, where $E \subset \mathbb{R}^n$, is continuous if and only if for every open
set $U \subset \mathbb{R}^m$ the set $f^{-1}(U)$ is open in $E$.

(r) A function $f : E \to \mathbb{R}^m$, where $E \subset \mathbb{R}^n$, is continuous if and only if for every closed
set $A \subset \mathbb{R}^m$ the set $f^{-1}(A)$ is closed in $E$.

(s) A function $f : E \to \mathbb{R}^m$, where $E \subset \mathbb{R}^n$, is continuous if and only if for every compact
set $K \subset \mathbb{R}^m$ the set $f^{-1}(K)$ is compact.


Solution. (a) False. The continuous image of a compact set is always compact and therefore
closed, but we could, for instance, take $f(x) = \frac{1}{x}$ on $(0,\infty)$, which takes $[1,\infty)$ (which is
closed) to $(0,1]$ (which is not).

(b) True by Theorem 11.3.

(c) False. A counterexample would be $f(x,y) = \frac{2x^2y}{x^4 + y^2}$ for $(x,y) \ne (0,0)$, with $f(0,0) = 0$.
The limit does not exist at zero (see part (j)).

(d) False. Same counterexample as (c).

(e) True by Theorem 11.7.

(f) True (assuming the set is closed and bounded in $\mathbb{R}^n$), because a closed and bounded
set in $\mathbb{R}^n$ is compact by the Heine-Borel theorem, and the continuous image of a compact
set is compact by Theorem 10.33, which implies that it is closed.

(g) True. For a set $E$, we know that $\partial(E) = \overline{E} \setminus E^\circ$. Since $\overline{E}$ is closed and $E^\circ$ is open,
we know that $\overline{E} \setminus E^\circ = \overline{E} \cap (\mathbb{R}^n \setminus E^\circ)$ is an intersection of closed sets and is closed.

(h) True by Theorem 10.17.

(i) True. A function $f : V \to \mathbb{R}^m$, where $V$ is open in $\mathbb{R}^n$, is differentiable at a
point $\mathbf{x}$ if and only if $\lim_{\mathbf{h}\to\mathbf{0}} \frac{|f(\mathbf{x}+\mathbf{h}) - f(\mathbf{x}) - Df(\mathbf{x})\mathbf{h}|}{|\mathbf{h}|} = 0$, which is true if and only if
$\lim_{\mathbf{h}\to\mathbf{0}} \frac{f_i(\mathbf{x}+\mathbf{h}) - f_i(\mathbf{x}) - Df_i(\mathbf{x}) \cdot \mathbf{h}}{|\mathbf{h}|} = 0$ for each $i$, where $Df_i(\mathbf{x})$ represents the $i$th row of
$Df(\mathbf{x})$, by Theorem 10.17, which is true if and only if $f_i$ is differentiable for each component
$f_i$ of $f$.

(j) False. A counterexample would be $f(x,y) = \frac{2x^2y}{x^4 + y^2}$ for $(x,y) \ne (0,0)$, with $f(0,0) = 0$. Approaching the origin along
any line would have a limit of zero, but approaching along $y = x^2$ would give a limit of 1.

(k) False. Unless it is known that at least one of $f_{xy}$ or $f_{yx}$ is continuous at the point
in question, Clairaut's Theorem fails.

(l) False. The continuous image of a connected set is connected but not necessarily
closed (just take the identity map $f(x) = x$, and note that $f((0,1)) = (0,1)$ is not closed).

(m) False. Let $f(x) = x^2$. Then $f^{-1}([1,4]) = [-2,-1] \cup [1,2]$, which is not connected.

(n) True. If the graph $G$ of a function $f : [a,b] \to \mathbb{R}$ is closed and connected we know
that the function is continuous by Theorem 10.48, which means that $F(x) = (x, f(x))$
is also continuous on $[a,b]$ by Theorem 10.17. Hence $F([a,b]) = G$ is compact by Theorem
10.33.




(o) True. The union of two closed sets is closed, and the union of two bounded sets is
bounded, so the union of two closed and bounded sets is closed and bounded. In $\mathbb{R}^n$ this
means that the union is compact by the Heine-Borel Theorem.

(p) False. The graphs of $y = x^2$ and $y = 2 - x^2$ are connected, but their intersection is
the disconnected set consisting of the points $(-1,1)$ and $(1,1)$.

(q) True by Theorem 10.30.

(r) True because the inverse image of the complement of a set is the complement of the inverse image, so
this follows from Theorem 10.30.

(s) False. The function $f(x) = 1$ is continuous on $\mathbb{R}$ and the single point set $\{1\}$ is compact,
but $f^{-1}(\{1\}) = \mathbb{R}$, which is not compact.
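The counterexample used in parts (c) and (j) can be explored numerically. The sketch below assumes the function is $f(x,y) = \frac{2x^2y}{x^4 + y^2}$ away from the origin with $f(0,0) = 0$; the sample directions and step sizes are arbitrary choices made for illustration.

```python
def f(x, y):
    """Assumed counterexample: 2x^2 y / (x^4 + y^2), with f(0, 0) = 0."""
    if x == 0 and y == 0:
        return 0.0
    return 2 * x**2 * y / (x**4 + y**2)

def along_line(a, b, t):
    """Value of f at the point t*(a, b) on a line through the origin."""
    return f(a * t, b * t)

def along_parabola(t):
    """Value of f at the point (t, t^2) on the parabola y = x^2."""
    return f(t, t**2)

directions = [(1, 0), (0, 1), (1, 1), (1, -2), (3, 0.5)]
line_values = [abs(along_line(a, b, 1e-6)) for a, b in directions]
parabola_values = [along_parabola(t) for t in (1e-1, 1e-2, 1e-3)]
```

Along each sampled line the values are already tiny at $t = 10^{-6}$, while along the parabola the value stays at 1 no matter how close the point is to the origin, so no single limit can exist there.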


Chapter 12 


Integration in Higher Dimensions 


When integrating a function over a domain in higher dimensional Euclidean spaces, we have
to address the question of which functions can be integrated and the question of which 
domains may be integrated over, though this can be reduced to considering which functions 
may be integrated over n-rectangles since integrating over a subset of such a rectangle can 
simply be thought of as integrating a function which is re-defined to be zero outside of that 
subset. Thus, we begin with integrating over a rectangle in R”. 


We begin with a lot of definitions and background theorems. A reader might get bogged 
down in this (necessarily extensive) foundation that we require to prove the main theorems 
we want to use about integration and unify ideas discussed earlier. For instance, we will 
have notions about volume based on integration over a region and notions based on outer 
and inner sums and we need to demonstrate that these are the same or our concept of 
volume is inconsistent. 


Definition 86 


If $g$ is a real valued function whose domain includes $\{1,2,3,\ldots,k\}$ for some $k \in \mathbb{N}$
then we use the notation $\prod_{i=1}^{k} g(i) = g(1)g(2)\cdots g(k)$. If $S_1, S_2, \ldots, S_k$ are sets then we
use $\prod_{i=1}^{k} S_i$ to denote the Cartesian product $S_1 \times S_2 \times \cdots \times S_k$ whose elements are
$k$-tuples $(s_1, s_2, s_3, \ldots, s_k)$ where each $s_i \in S_i$ for $1 \le i \le k$.
We define an n-rectangle (or just a rectangle) to be the Cartesian product of $n$
closed intervals $R = \prod_{i=1}^{n} [a_i,b_i]$. We say the n-volume (or just the volume) of $R$ is
$|R| = \prod_{i=1}^{n} (b_i - a_i)$. We may refer to $[a_i,b_i]$ as the $i$th edge factor of $R$. An n-cube or
cube in $\mathbb{R}^n$ is a rectangle whose edge factors all have equal length. We will refer to
$C_\epsilon(\mathbf{x}) = \prod_{i=1}^{n} \left[x_i - \frac{\epsilon}{2}, x_i + \frac{\epsilon}{2}\right]$ as the $\epsilon$-cube centered at $\mathbf{x} = (x_1, x_2, x_3, \ldots, x_n)$.




A rectangle in $\mathbb{R}^2$ is the normal idea of a rectangle plus the region enclosed by that
rectangle (also called a rectangular disk). A rectangle in $\mathbb{R}$ is just a line segment, and a
rectangle in $\mathbb{R}^3$ is a box plus the region enclosed by the box. Similarly, a cube in $\mathbb{R}^2$ is
a square plus the region enclosed by the square. When integrating a function of multiple
variables we multiply function values by the volumes of rectangles rather than the length
of line segments in the domain as we did for single variable functions.


Definition 87 


Let $f : R \to \mathbb{R}$, where $R = \prod_{i=1}^{n} [a_i,b_i]$ is a rectangle in $\mathbb{R}^n$. For partitions $P_k =
\{x_0^{(k)}, x_1^{(k)}, \ldots, x_{m_k}^{(k)}\}$ of $[a_k,b_k]$ for $1 \le k \le n$, we refer to $P = \{P_1, P_2, P_3, \ldots, P_n\}$ as
a partition of $R$, and we refer to

$$G = G(P) = \left\{ \prod_{j=1}^{n} \left[x_{i_j - 1}^{(j)}, x_{i_j}^{(j)}\right] \;\middle|\; 1 \le i_j \le m_j \text{ for each } 1 \le j \le n \right\}$$

as the grid on $R$ (or over $R$) induced by $P$. The mesh $|G|$ of grid $G$
is the largest diameter of any element of $G$.
Let partitions $Q_1, Q_2, \ldots, Q_n$ be partitions of $[a_1,b_1], [a_2,b_2], \ldots, [a_n,b_n]$ which
induce grid $H$ on $R = \prod_{i=1}^{n} [a_i,b_i]$. Let partitions $P_1, P_2, \ldots, P_n$ be partitions of
$[a_1,b_1], [a_2,b_2], \ldots, [a_n,b_n]$ which induce grid $G$ on $R$. If partitions $Q_1, Q_2, \ldots, Q_n$
are refinements of partitions $P_1, P_2, P_3, \ldots, P_n$ respectively, then we say that $H$ is
a refinement of $G$ and that partition $Q = \{Q_1, Q_2, Q_3, \ldots, Q_n\}$ is a refinement of
partition $P = \{P_1, P_2, P_3, \ldots, P_n\}$. We will use the notation $G * H$ to denote the
refinement of $G$ and $H$ induced by partition $P * Q = \{P_1 \cup Q_1, P_2 \cup Q_2, \ldots, P_n \cup Q_n\}$.

For each $R_t \in G$ we let $M_t = \sup_{\mathbf{x} \in R_t} f(\mathbf{x})$ and $m_t = \inf_{\mathbf{x} \in R_t} f(\mathbf{x})$. We define the upper
sum of $f$ with respect to grid $G$ to be $U(f,G) = \sum_{R_t \in G} M_t|R_t|$, and the lower sum of
$f$ with respect to grid $G$ to be $L(f,G) = \sum_{R_t \in G} m_t|R_t|$.

Here, the idea of a grid takes the place of the notion of a partition of a closed interval 
in one dimension. 
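To make the definitions of grids and upper and lower sums concrete, here is a small sketch that computes $L(f,G)$ and $U(f,G)$ exactly for $f(x,y) = x + y$ on $[0,1] \times [0,1]$. It relies on an assumption that is specific to this example: because $f$ is increasing in each variable, its supremum and infimum on each grid rectangle are attained at corners. The function names are illustrative.

```python
from fractions import Fraction
from itertools import product

def uniform_partition(a, b, m):
    """Partition of [a, b] into m subintervals of equal length."""
    return [a + (b - a) * Fraction(i, m) for i in range(m + 1)]

def grid_cells(partitions):
    """Yield each grid rectangle as a tuple of (lo, hi) edge factors."""
    edges = [list(zip(p[:-1], p[1:])) for p in partitions]
    return product(*edges)

def volume(cell):
    """Product of the edge lengths of a rectangle."""
    v = Fraction(1)
    for lo, hi in cell:
        v *= hi - lo
    return v

def lower_upper_sums(f, partitions):
    """Lower and upper sums, taking inf and sup over the corners of each cell
    (valid here only because f is monotone in each variable)."""
    lower = upper = Fraction(0)
    for cell in grid_cells(partitions):
        corner_values = [f(*pt) for pt in product(*cell)]
        lower += min(corner_values) * volume(cell)
        upper += max(corner_values) * volume(cell)
    return lower, upper

f = lambda x, y: x + y
coarse = [uniform_partition(0, 1, 2)] * 2    # 2 x 2 grid
fine = [uniform_partition(0, 1, 4)] * 2      # 4 x 4 grid, a refinement
lower_coarse, upper_coarse = lower_upper_sums(f, coarse)
lower_fine, upper_fine = lower_upper_sums(f, fine)
```

In this example the finer grid raises the lower sum and reduces the upper sum, and both bracket the value 1, which is the integral of $x + y$ over the unit square.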


Definition 88 


We refer to the upper integral of $f$ to be $(U)\int_R f = \inf_G U(f,G)$ (where $G$ is
understood to range over all possible grids on $R$). We refer to the lower integral of
$f$ to be $(L)\int_R f = \sup_G L(f,G)$. If $(U)\int_R f = (L)\int_R f = I$ then we say that $f$ is
integrable and the integral of $f$ on $R$ (or over $R$) is $\int_R f = I$.
If $T_k = \{t_1^{(k)}, t_2^{(k)}, \ldots, t_{m_k}^{(k)}\}$ is a marking of $P_k$ for each $1 \le k \le n$ then we will
refer to $T = T(P) = \prod_{k=1}^{n} T_k$ as a marking of the grid $G$. We may shorten notation
by saying that if $R_t = \prod_{j=1}^{n} \left[x_{i_j - 1}^{(j)}, x_{i_j}^{(j)}\right] \in G$ then $\mathbf{x}_t = \left(t_{i_1}^{(1)}, t_{i_2}^{(2)}, \ldots, t_{i_n}^{(n)}\right) \in T$. We
say that $S_T(f,G) = \sum_{R_t \in G} f(\mathbf{x}_t)|R_t|$ is the Riemann sum of $f$ over $G$ with respect to
marking $T$.


Note that, for each $R_t \in G$, it is always true that if $\mathbf{x},\mathbf{y} \in R_t$ then $|\mathbf{x} - \mathbf{y}| \le |G|$.


We tend to use iterated integral signs to refer to the integral of an integral, but iterated
integral signs without bounds are also used simply to declare the dimension of the domain
to be integrated over. Thus, if $R$ is a two dimensional rectangle we may write $\int_R f =
\iint_R f = \iint_R f\,dA = \iint_R f(x,y)\,dA$. All of these mean the same thing, and while the
$dA$ is not a required convention, using the letter $A$ is intended to suggest to the reader
that the grid rectangles have a two dimensional volume (an area). Likewise, if $R$ is a 3-rectangle
then we use a variety of equivalent notations $\int_R f = \iiint_R f = \iiint_R f\,dV =
\iiint_R f(x,y,z)\,dV$, where the letter $V$ is intended to make the reader think of the grid
rectangles as having a three (or higher) dimensional volume.


In dimension two these notions are easiest to picture. A rectangle $R = [a_1,b_1] \times [a_2,b_2]$
is the set of all points $(x,y) \in \mathbb{R}^2$ so that $a_1 \le x \le b_1$ and $a_2 \le y \le b_2$. Readers are
encouraged to take a few minutes thinking about what each of these definitions would look 
like for a two dimensional rectangle (integrals of positive functions over these rectangles are 
volumes, so it is reasonable to see the definitions visually). 


Our first theorem addresses ideas of containment and convexity. Each ball contains a 
cube, and each cube contains a ball about a point. Rectangles are convex, and so are balls. 
We can use open cubes as a basis for a topology on R” instead of open balls when it is 
convenient. 


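Theorem 12.1. Some basic properties of rectangles and distance:
(a) Let $\mathbf{p} = (p_1, p_2, p_3, \ldots, p_n) \in \mathbb{R}^n$ and let $\epsilon > 0$. Then the $\epsilon$-cube $C_\epsilon(\mathbf{p})$ has diameter
$\epsilon\sqrt{n}$, and $C_\epsilon(\mathbf{p}) \subset B_{\epsilon\sqrt{n}}(\mathbf{p})$ and $B_\epsilon(\mathbf{p}) \subset C_{2\epsilon}(\mathbf{p})$.

(b) Let $R = \prod_{i=1}^{n} [a_i,b_i]$ be an n-rectangle in $\mathbb{R}^n$, and let $\mathbf{a} = (a_1, a_2, a_3, \ldots, a_n)$ and $\mathbf{b} =
(b_1, b_2, b_3, \ldots, b_n)$. Then the diameter of $R$ is $|\mathbf{a} - \mathbf{b}|$. Furthermore, $R$ is convex.

(c) A set $U \subset \mathbb{R}^n$ is open if and only if for every $\mathbf{p} \in U$ there is an $\epsilon > 0$ so that
$C_\epsilon(\mathbf{p}) \subset U$.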


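Proof. (a) Since $\left|\left(p_1 - \frac{\epsilon}{2}, p_2 - \frac{\epsilon}{2}, \ldots, p_n - \frac{\epsilon}{2}\right) - \left(p_1 + \frac{\epsilon}{2}, p_2 + \frac{\epsilon}{2}, \ldots, p_n + \frac{\epsilon}{2}\right)\right| = \epsilon\sqrt{n}$
we know that $\mathrm{diam}(C_\epsilon(\mathbf{p})) \ge \epsilon\sqrt{n}$. For any $\mathbf{x},\mathbf{y} \in C_\epsilon(\mathbf{p})$ we know that
$|x_i - y_i| \le \epsilon$ for each $1 \le i \le n$, which means that $|\mathbf{x} - \mathbf{y}| = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \le \epsilon\sqrt{n}$ and
therefore $\mathrm{diam}(C_\epsilon(\mathbf{p})) = \epsilon\sqrt{n}$.

We note that for any $\mathbf{x} \in C_\epsilon(\mathbf{p})$, for each $1 \le i \le n$, we know $|x_i - p_i| \le \frac{\epsilon}{2}$, so
$|\mathbf{x} - \mathbf{p}| \le \frac{\epsilon\sqrt{n}}{2} < \epsilon\sqrt{n}$, so $C_\epsilon(\mathbf{p}) \subset B_{\epsilon\sqrt{n}}(\mathbf{p})$.

For any $\mathbf{x} \in B_\epsilon(\mathbf{p})$ we know that $|x_i - p_i| \le |\mathbf{x} - \mathbf{p}| < \epsilon$ for each $1 \le i \le n$, which means
that $\mathbf{x} \in C_{2\epsilon}(\mathbf{p})$.

(b) We know that the diameter of $R$ is at least $|\mathbf{a} - \mathbf{b}|$ since $\mathbf{a},\mathbf{b} \in R$. For any $\mathbf{x} =
(x_1, x_2, x_3, \ldots, x_n)$, $\mathbf{y} = (y_1, y_2, y_3, \ldots, y_n) \in R$ we also know that $x_i, y_i \in [a_i,b_i]$ for each
$1 \le i \le n$. Therefore, $|x_i - y_i| \le |a_i - b_i|$, which means that $|\mathbf{x} - \mathbf{y}| = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \le |\mathbf{a} - \mathbf{b}|$,
so $\mathrm{diam}(R) = |\mathbf{a} - \mathbf{b}|$ as desired.

To see that $R$ is convex, let $\mathbf{x},\mathbf{y} \in R$. Then $L(\mathbf{x},\mathbf{y}) = \{(1-t)\mathbf{x} + t\mathbf{y} \mid 0 \le t \le 1\}$. For each
$1 \le i \le n$ and each $0 \le t \le 1$ we note that if $x_i \le y_i$ then $(1-t)x_i + ty_i = x_i + t(y_i - x_i) \ge x_i$
since $y_i - x_i \ge 0$ and also $(1-t)x_i + ty_i \le (1-t)y_i + ty_i = y_i$ since $1 - t \ge 0$. Similarly, if
$x_i > y_i$ then $(1-t)x_i + ty_i \in [y_i, x_i]$. Since $a_i \le x_i$ and $a_i \le y_i$, and $x_i \le b_i$ and $y_i \le b_i$ for
each $1 \le i \le n$, it follows that $(1-t)x_i + ty_i \in [a_i,b_i]$ for each $1 \le i \le n$ and $0 \le t \le 1$ and
therefore $L(\mathbf{x},\mathbf{y}) \subset R$. Thus, $R$ is convex.

(c) Let $U$ be open and let $\mathbf{p} \in U$. Choose $\epsilon > 0$ so that $B_\epsilon(\mathbf{p}) \subset U$. Then by part (a)
we know that $C_{\epsilon/\sqrt{n}}(\mathbf{p}) \subset B_\epsilon(\mathbf{p}) \subset U$.

Let $U$ be a set so that every point in $U$ is contained in an epsilon cube centered at that
point and contained in $U$. Let $\mathbf{p} \in U$ and choose $\epsilon > 0$ so that $C_\epsilon(\mathbf{p}) \subset U$. Then by part
(a) we know that $B_{\epsilon/2}(\mathbf{p}) \subset C_\epsilon(\mathbf{p}) \subset U$, so $U$ is open.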
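Part (a) of Theorem 12.1 can be illustrated numerically for a particular cube. The sketch below assumes the $\epsilon$-cube has side length $\epsilon$ and is centered at $\mathbf{p}$, and it uses the fact that the largest distance between two points of a rectangle is attained at a pair of corners; the specific point and value of $\epsilon$ are arbitrary.

```python
import math
from itertools import product

def cube_corners(p, eps):
    """Corners of the eps-cube centered at p (side length eps)."""
    sides = [(pi - eps / 2, pi + eps / 2) for pi in p]
    return [tuple(c) for c in product(*sides)]

def diameter_from_corners(corners):
    """Largest distance between two corners of the cube."""
    return max(math.dist(a, b) for a in corners for b in corners)

p = (1.0, -2.0, 0.5)      # a point of R^3, so n = 3
eps = 0.5
n = len(p)

corners = cube_corners(p, eps)
diam = diameter_from_corners(corners)
farthest_from_center = max(math.dist(p, c) for c in corners)
```

The farthest corner sits at distance $\epsilon\sqrt{n}/2$ from the center, which is why the cube fits inside the ball of radius $\epsilon\sqrt{n}$ with room to spare.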


Our next theorem addresses the usual ordering of sums for a grid, that lower sums are
no more than Riemann sums, which are no more than upper sums over any particular grid.


Theorem 12.2. Let $f : R \to \mathbb{R}$ be bounded, where $R$ is a rectangle in $\mathbb{R}^n$. Let $G$ be a grid
over $R$. Let $T$ be a marking of $G$. Then $L(f,G) \le S_T(f,G) \le U(f,G)$.


Proof. For each $R_t \in G$ we know that $m_t \le f(\mathbf{x}_t) \le M_t$, from which it follows that

$$\sum_{R_t \in G} m_t|R_t| \le \sum_{R_t \in G} f(\mathbf{x}_t)|R_t| \le \sum_{R_t \in G} M_t|R_t|.$$



As in one dimension, we have to establish that refining a grid will make upper sums 
decrease (or stay the same) and lower sums increase (or stay the same), which is the objective 
of the next theorem. 


Theorem 12.3. Let $f : R \to \mathbb{R}$ be bounded, where $R = \prod_{i=1}^{n} [a_i,b_i]$ is an n-rectangle in $\mathbb{R}^n$.
Let $P = \{P_1, P_2, P_3, \ldots, P_n\}$, where $P_k = \{x_0^{(k)}, x_1^{(k)}, \ldots, x_{m_k}^{(k)}\}$ is a partition of $[a_k,b_k]$ for
$1 \le k \le n$. Let $Q = \{Q_1, Q_2, Q_3, \ldots, Q_n\}$, where $Q_k = \{q_0^{(k)}, q_1^{(k)}, \ldots, q_{r_k}^{(k)}\}$ is a partition of
$[a_k,b_k]$ for $1 \le k \le n$, and let $G = G(P)$ and $H = H(Q)$ be the grids induced by these
partitions, where $H$ is a refinement of $G$. Then $L(f,G) \le L(f,H) \le U(f,H) \le U(f,G)$.


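Proof. We first prove the result is true when $Q_j = P_j \cup \{q\}$, where $q \in (x_{p-1}^{(j)}, x_p^{(j)})$, and
$P_i = Q_i$ for $i \ne j$. Let $S$ be the set of all elements of grid $G$ whose $j$th edge factor is
$[x_{p-1}^{(j)}, x_p^{(j)}]$. Let $R_t = [x_{i_1-1}^{(1)}, x_{i_1}^{(1)}] \times [x_{i_2-1}^{(2)}, x_{i_2}^{(2)}] \times \cdots \times [x_{p-1}^{(j)}, x_p^{(j)}] \times \cdots \times [x_{i_n-1}^{(n)}, x_{i_n}^{(n)}] \in S$. Then
$R_t = R_t^{L} \cup R_t^{U}$, where

$$R_t^{L} = [x_{i_1-1}^{(1)}, x_{i_1}^{(1)}] \times [x_{i_2-1}^{(2)}, x_{i_2}^{(2)}] \times \cdots \times [x_{p-1}^{(j)}, q] \times \cdots \times [x_{i_n-1}^{(n)}, x_{i_n}^{(n)}] \in H$$

and

$$R_t^{U} = [x_{i_1-1}^{(1)}, x_{i_1}^{(1)}] \times [x_{i_2-1}^{(2)}, x_{i_2}^{(2)}] \times \cdots \times [q, x_p^{(j)}] \times \cdots \times [x_{i_n-1}^{(n)}, x_{i_n}^{(n)}] \in H.$$

If we let $M_t^{L} = \sup_{R_t^{L}} f(\mathbf{x})$ and $m_t^{L} = \inf_{R_t^{L}} f(\mathbf{x})$, and we let $M_t^{U} = \sup_{R_t^{U}} f(\mathbf{x})$ and
$m_t^{U} = \inf_{R_t^{U}} f(\mathbf{x})$, then by Theorem 1.17 we know that $M_t \ge \max\{M_t^{L}, M_t^{U}\}$ and $m_t \le
\min\{m_t^{L}, m_t^{U}\}$ (where $M_t = \sup_{R_t} f(\mathbf{x})$ and $m_t = \inf_{R_t} f(\mathbf{x})$).
Thus, it follows that

$$U(f,G) - U(f,H) = \sum_{R_t \in S} \left(M_t|R_t| - M_t^{L}|R_t^{L}| - M_t^{U}|R_t^{U}|\right) \ge \sum_{R_t \in S} M_t\left(|R_t| - |R_t^{L}| - |R_t^{U}|\right) = 0,$$

so $U(f,G) \ge U(f,H)$. Likewise,

$$L(f,G) - L(f,H) = \sum_{R_t \in S} \left(m_t|R_t| - m_t^{L}|R_t^{L}| - m_t^{U}|R_t^{U}|\right) \le \sum_{R_t \in S} m_t\left(|R_t| - |R_t^{L}| - |R_t^{U}|\right) = 0,$$

so $L(f,G) \le L(f,H)$.

Next, let $P_i = Q_i$ for $i \ne j$ and let $Q_j = P_j \cup \{q_1, q_2, \ldots, q_m\}$. Then let $H_i$ be the grid
induced by $\{P_1, P_2, \ldots, P_{j-1}, P_j \cup \{q_1, q_2, \ldots, q_i\}, P_{j+1}, \ldots, P_n\}$ for all $1 \le i \le m$. Then by the first case we see
that $L(f,G) \le L(f,H_1) \le \cdots \le L(f,H_m) = L(f,H) \le U(f,H) = U(f,H_m) \le \cdots \le U(f,H_1) \le U(f,G)$.

Finally, let $Q_s = P_s \cup D_s$ for some finite set $D_s$ contained in $[a_s,b_s]$ for each $1 \le s \le n$.
We define grid $K_w$, for each positive integer $1 \le w \le n$, to be the grid induced by $P_s \cup D_s$ for
$1 \le s \le w$ and $P_s$ for $w < s \le n$. By the preceding case we see that $L(f,G) \le
L(f,K_1) \le \cdots \le L(f,K_n) = L(f,H) \le U(f,H) = U(f,K_n) \le \cdots \le U(f,K_1) \le U(f,G)$.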


Paralleling the development in single variable calculus, we next establish that lower 
sums never exceed upper sums, even if they are upper and lower sums over different grids. 




Theorem 12.4. Let $f : R \to \mathbb{R}$, where $R = \prod_{i=1}^{n} [a_i,b_i]$ is an n-rectangle in $\mathbb{R}^n$, be bounded.
Let $G$ and $H$ be grids on $R$. Then $L(f,G) \le U(f,H)$. Furthermore, $(L)\int_R f \le (U)\int_R f$.


Proof. If $P_i, Q_i$ are partitions on $[a_i,b_i]$ for $1 \le i \le n$ so $G = G(P_1, P_2, P_3, \ldots, P_n)$ and
$H = H(Q_1, Q_2, Q_3, \ldots, Q_n)$, then $G * H$ is a grid which refines both $G$ and $H$. By Theorem
12.3 it follows that $L(f,G) \le L(f,G*H) \le U(f,G*H) \le U(f,H)$.

Thus, for any grid $H$ on $R$, it is the case that $U(f,H) \ge L(f,G)$ for all grids $G$ on $R$,
so $U(f,H) \ge (L)\int_R f$, which makes $(L)\int_R f$ a lower bound for all upper sums $U(f,H)$,
which means that $(L)\int_R f \le (U)\int_R f$.

As with single variable integrals, we can next establish that a function is integrable if 
and only if the upper and lower sums can be made arbitrarily close to one another, which 
is the next theorem. 


Theorem 12.5. Let $f : R \to \mathbb{R}$ be bounded, where $R$ is a rectangle in $\mathbb{R}^n$. Then $f$ is
integrable if and only if for every $\epsilon > 0$ there is a grid $G$ on $R$ so that $U(f,G) - L(f,G) < \epsilon$.

Proof. First, assume that $f$ is integrable and let $\epsilon > 0$. By the approximation property, we can find
grids $G, H$ so that $U(f,G) < (U)\int_R f + \frac{\epsilon}{2}$ and $L(f,H) > (L)\int_R f - \frac{\epsilon}{2}$, which means that

$$\int_R f - \frac{\epsilon}{2} < L(f,H) \le L(f,G*H) \le U(f,G*H) \le U(f,G) < \int_R f + \frac{\epsilon}{2}.$$

Thus, it follows that $U(f,G*H) - L(f,G*H) < \epsilon$.
Next, assume that for every $\epsilon > 0$ there is a grid $G$ on $R$ so that $U(f,G) - L(f,G) < \epsilon$.
Let $\epsilon > 0$. Choose $G$ so that $U(f,G) - L(f,G) < \epsilon$. But then we know that $L(f,G) \le
(L)\int_R f \le (U)\int_R f \le U(f,G)$, so $0 \le (U)\int_R f - (L)\int_R f < \epsilon$. Since this is true for all
$\epsilon > 0$ it follows that $(L)\int_R f = (U)\int_R f$, so $f$ is integrable.
R R 


We next show that continuous functions on a rectangle are integrable, and that if a grid 
mesh is sufficiently small then upper and lower sums can be made arbitrarily close to one 
another. 


Theorem 12.6. Let $f : R \to \mathbb{R}$ be continuous, where $R$ is a rectangle in $\mathbb{R}^n$. Then $f$ is
integrable. Furthermore, for any $\epsilon > 0$ we can find a number $\delta > 0$ so that if $G$ is any grid
over $R$ with $|G| < \delta$ then $U(f,G) - L(f,G) < \epsilon$.


Proof. Since $f$ is continuous on the closed and bounded rectangle $R$, from Theorem 10.35
we know that $f$ is uniformly continuous. Let $\epsilon > 0$. Choose $\delta > 0$ so that if $|\mathbf{x} - \mathbf{y}| < \delta$ then
$|f(\mathbf{x}) - f(\mathbf{y})| < \frac{\epsilon}{|R|}$. Let $G$ be a grid over $R$ with $|G| < \delta$. By the Extreme Value Theorem
there are points $\mathbf{p}_t, \mathbf{q}_t \in R_t$ for each rectangle $R_t \in G$ so that $f(\mathbf{q}_t) \le f(\mathbf{x}) \le f(\mathbf{p}_t)$ for
each $\mathbf{x} \in R_t$. Since $|\mathbf{p}_t - \mathbf{q}_t| \le |G| < \delta$, we have $U(f,G) - L(f,G) = \sum_{R_t \in G} (f(\mathbf{p}_t) - f(\mathbf{q}_t))|R_t| < \frac{\epsilon}{|R|} \sum_{R_t \in G} |R_t| = \epsilon$, which
means that $f$ is integrable.


We would like to integrate over regions that are not rectangles. To do this, we just extend 
the function to be defined on a rectangle by defining the function to be zero elsewhere, but it 
would be nice to know which regions continuous functions can be restricted to and continue 
to be integrable over those regions. The ideas of Lebesgue measure zero and Jordan content 
will help us to understand which regions can be integrated over and which functions are 
integrable. 


Definition 89 


Let $S \subset \mathbb{R}^n$. We say that $S$ has Lebesgue measure zero, written $\lambda(S) = 0$, if, for
every $\epsilon > 0$ there is a countable collection of rectangles $\{R_i\}_{i \in \mathbb{N}}$ (the collection can
also be finite) which covers $S$ so that $\sum_{i=1}^{\infty} |R_i| < \epsilon$. We say that $S$ has Jordan content
zero or volume zero, denoted $\mathrm{Vol}(S) = 0$, if, for every $\epsilon > 0$ there is a finite collection
of rectangles $\{R_i\}_{1 \le i \le m}$ which covers $S$ so that $\sum_{i=1}^{m} |R_i| < \epsilon$.
If $E \subset R$, an n-rectangle, and $G = \{R_t\}_{1 \le t \le k}$ is a grid on $R$ then we define $O(E,G)$
to denote $\{R_t \in G \mid R_t \cap \overline{E} \ne \emptyset\}$, the set of outer rectangles for $E$, $I(E,G)$ to denote
$\{R_t \in G \mid R_t \subset E^\circ\}$, the set of inner rectangles, and $S(E,G) = \{R_t \in G \mid R_t \cap E \ne \emptyset\}$
the set of intersecting rectangles for $E$ with respect to grid $G$.
We define $V(E,G) = \sum_{R_t \in O(E,G)} |R_t|$ to be the outer sum of $E$ with respect to the
grid $G$, and $v(E,G) = \sum_{R_t \in I(E,G)} |R_t|$ to be the inner sum of $E$ with respect to the
grid $G$. We define the outer volume of $E$ to be $(O)\mathrm{Vol}(E) = \inf_G V(E,G)$ and the
inner volume of $E$ to be $(I)\mathrm{Vol}(E) = \sup_G v(E,G)$.
Sets $S$ and $T$ are non-overlapping if $\mathrm{Vol}(S \cap T) = 0$. If it is false that $S$ and $T$
are non-overlapping then we say that $S$ and $T$ overlap.
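Outer and inner sums are easy to compute for a simple set. The sketch below takes $E$ to be the closed unit disk inside $R = [-1,1] \times [-1,1]$ and uses uniform grids; it assumes the outer rectangles are those meeting the closure of $E$ and the inner rectangles are those contained in the interior of $E$. The function names are illustrative, and exact rational arithmetic is used so that the comparisons with 1 are not affected by rounding.

```python
from fractions import Fraction
import math

def uniform_partition(a, b, m):
    """Partition of [a, b] into m subintervals of equal length."""
    return [a + (b - a) * Fraction(i, m) for i in range(m + 1)]

def cells(px, py):
    """Yield the grid rectangles as tuples (x0, x1, y0, y1)."""
    for x0, x1 in zip(px[:-1], px[1:]):
        for y0, y1 in zip(py[:-1], py[1:]):
            yield (x0, x1, y0, y1)

def meets_closed_disk(cell):
    """True if the rectangle contains a point with x^2 + y^2 <= 1."""
    x0, x1, y0, y1 = cell
    nx = min(max(0, x0), x1)      # coordinate of the point nearest the origin
    ny = min(max(0, y0), y1)
    return nx * nx + ny * ny <= 1

def inside_open_disk(cell):
    """True if every point of the rectangle satisfies x^2 + y^2 < 1."""
    x0, x1, y0, y1 = cell
    fx = max(abs(x0), abs(x1))    # coordinate of the corner farthest from the origin
    fy = max(abs(y0), abs(y1))
    return fx * fx + fy * fy < 1

def outer_inner_sums(m):
    """Outer sum V(E, G) and inner sum v(E, G) for the m x m uniform grid."""
    p = uniform_partition(Fraction(-1), Fraction(1), m)
    outer = inner = Fraction(0)
    for c in cells(p, p):
        area = (c[1] - c[0]) * (c[3] - c[2])
        if meets_closed_disk(c):
            outer += area
        if inside_open_disk(c):
            inner += area
    return outer, inner

outer8, inner8 = outer_inner_sums(8)
outer16, inner16 = outer_inner_sums(16)   # the 16 x 16 grid refines the 8 x 8 grid
```

As the grid is refined the inner sum grows and the outer sum shrinks, and both bracket the area $\pi$ of the disk.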


Note that in the above definition a finite union of rectangles is always closed, which
means that any finite collection of rectangles covering $E$ also covers $\overline{E}$. Hence, if we let $D$ be the
set of all finite sets of rectangles which cover $\overline{E}$, then $(O)\mathrm{Vol}(E) = \inf_{C \in D} \sum_{R \in C} |R|$ (because
$D$ is also the collection of all finite sets of rectangles that cover $E$). Thus, $(O)\mathrm{Vol}(E) = 0$
if and only if $\mathrm{Vol}(E) = 0$.

Also, in the definition of rectangle above, we have only defined a set to be an n-rectangle 
if it is the product of edge factors in the axes. However, by re-orienting the axes we can 
extend this definition to a product relative to any orthogonal set of coordinate directions. 
The terms “inner rectangles” and “outer rectangles” will be useful for us, but they are not standard terms used in other texts, and the choice $S(E,G)$ is supposed to make the reader think of rectangles “sharing” points with $E$.

In terms of these sets, $U(f,G) = \sum_{R_j \in S(E,G)} M_j|R_j|$ and $L(f,G) = \sum_{R_j \in S(E,G)} m_j|R_j|$.

At this point we also have two notions of volume for a rectangle (product of side lengths 
or infimum of sums of volumes of rectangles in finite covers of the rectangle). One of the 
things we will have to resolve is to show that these two definitions are equivalent. 

First, it is helpful to notice that while there are some topological advantages to an 
open or closed cover by rectangles or interiors of rectangles, these ideas could be used 
interchangeably for establishing volume or measure zero, which is the objective of the next 
three theorems. 


We first demonstrate that we can fatten a rectangle slightly to create a rectangle whose 
interior contains the first rectangle, increasing the volume by an arbitrarily small amount. 
In some situations it can be advantageous if the larger rectangle has rational side length 
(because such a rectangle can be subdivided into smaller cubes of equal side length). 


Theorem 12.7. Let $R = \prod_{i=1}^{n} [a_i, b_i]$ be a rectangle in $\mathbb{R}^n$ and let $\epsilon > 0$. Then there is a number $l$ so that if $0 < d_i < l$ and $0 < c_i < l$ for each $1 \le i \le n$ then $Q = \prod_{i=1}^{n} [a_i - c_i, b_i + d_i]$ is a rectangle so that $R \subset Q^\circ$ and $\mathrm{Vol}(Q) - \mathrm{Vol}(R) < \epsilon$. If $R$ is a cube then we can choose $c_i$ and $d_i$ so that $Q$ is also a cube centered at the same point as $R$. We can also choose the $c_i$ and $d_i$ so that each $a_i - c_i$ and each $b_i + d_i$ is rational.


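Proof. First, note that $g : \mathbb{R}^n \to \mathbb{R}$ defined by $g(x_1, x_2, \ldots, x_n) = x_1 x_2 x_3 \cdots x_n$ is a product of continuous functions and is therefore a continuous function, and in particular $g$ is continuous at $b - a$, where $b = (b_1, b_2, b_3, \ldots, b_n)$ and $a = (a_1, a_2, a_3, \ldots, a_n)$. Hence, we can find $\delta > 0$ so that if $|x - (b - a)| < \delta$ then $|g(x) - g(b - a)| < \epsilon$. In particular, if we set $c = \left(b_1 - a_1 + \frac{\delta}{2\sqrt{n}}, b_2 - a_2 + \frac{\delta}{2\sqrt{n}}, \ldots, b_n - a_n + \frac{\delta}{2\sqrt{n}}\right)$ then $|b - a - c| = \sqrt{n \cdot \frac{\delta^2}{4n}} = \frac{\delta}{2} < \delta$. Setting $Q = \prod_{i=1}^{n} \left[a_i - \frac{\delta}{4\sqrt{n}}, b_i + \frac{\delta}{4\sqrt{n}}\right]$ we have that $\mathrm{Vol}(Q) = g(c)$ and $\mathrm{Vol}(R) = g(b - a)$, so $\mathrm{Vol}(Q) - \mathrm{Vol}(R) < \epsilon$. Notice that if we replace $Q$ by $\prod_{i=1}^{n} [a_i - c_i, b_i + d_i]$ where $0 < c_i < l$ and $0 < d_i < l$ with $l = \frac{\delta}{4\sqrt{n}}$ then each edge factor is no larger than before, so $\mathrm{Vol}(R) < \mathrm{Vol}(Q) < \mathrm{Vol}(R) + \epsilon$. Also, if $R$ is a cube and each $d_i = c_i = d < l$ then the new rectangle $Q = \prod_{i=1}^{n} [a_i - d, b_i + d]$ is a cube (since each side length was increased by the same amount).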
Let $m = \min\{c_1, \ldots, c_n, d_1, \ldots, d_n\}$. For any $x = (x_1, x_2, x_3, \ldots, x_n) \in R$ we note that $B_m(x) \subset \prod_{i=1}^{n} [x_i - m, x_i + m] \subset Q$, so $R \subset Q^\circ$.

Finally, if we choose $q_i$, a rational number between $b_i$ and $b_i + l$, then we can set $d_i = q_i - b_i < l$ so that $b_i + d_i = q_i$, a rational number. Likewise, we can choose $c_i$ so that $a_i - c_i$ is rational if we wish.
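The fattening step can be tried numerically. The following Python sketch is an illustration only: the helper names `volume` and `fatten` and the halving search for the margin $d$ are assumptions, not the construction used in the proof. It enlarges every edge of a rectangle by the same rational margin until the volume gain drops below $\epsilon$.

```python
from fractions import Fraction
from math import prod

def volume(rect):
    """Product of side lengths of a rectangle given as a list of (a_i, b_i)."""
    return prod(b - a for a, b in rect)

def fatten(rect, eps):
    """Return Q = prod [a_i - d, b_i + d] with Vol(Q) - Vol(R) < eps.
    The margin d is found by repeated halving (an illustrative choice)."""
    d = Fraction(1)
    while True:
        q = [(a - d, b + d) for a, b in rect]
        if volume(q) - volume(rect) < eps:
            return q
        d /= 2

R = [(Fraction(0), Fraction(1)), (Fraction(0), Fraction(2))]
Q = fatten(R, Fraction(1, 100))
print(Q, volume(Q) - volume(R))
```

Because the margin is a positive rational, the endpoints of `Q` stay rational whenever those of `R` are, and `R` lies in the interior of `Q`.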




Since we can fatten rectangles and cubes slightly, we can also increase the volumes of 
covers consisting of such rectangles and cubes slightly, placing rectangles in the original 
cover into the interiors of rectangles in the new cover. 


Theorem 12.8. Let $E \subset \mathbb{R}^n$ and let $\epsilon > 0$. If there is a countable collection of rectangles $\{R_i\}_{i \in \mathbb{N}}$ which covers $E$ so that $\sum_{i=1}^{\infty} |R_i| < \epsilon$ then there is a countable collection of rectangles $\{Q_i\}_{i \in \mathbb{N}}$ so that $R_i \subset Q_i^\circ$ for each $i \in \mathbb{N}$ and $\sum_{i=1}^{\infty} |Q_i| < \epsilon$.

Likewise, if there is a finite collection of rectangles $\{R_i\}_{1 \le i \le k}$ which covers $E$ so that $\sum_{i=1}^{k} |R_i| < \epsilon$ then there is a finite collection of rectangles $\{Q_i\}_{1 \le i \le k}$ so that $R_i \subset Q_i^\circ$ for each $i \in \{1, 2, 3, \ldots, k\}$ and $\sum_{i=1}^{k} |Q_i| < \epsilon$.
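Proof. Let $\{R_i\}_{i \in \mathbb{N}}$ be a cover of $E$ so that $\sum_{i=1}^{\infty} |R_i| = \gamma < \epsilon$. By Theorem 12.7, we can choose $Q_i$ for each $i \in \mathbb{N}$ so that $R_i \subset Q_i^\circ$ and $\mathrm{Vol}(Q_i) - \mathrm{Vol}(R_i) < \frac{\epsilon - \gamma}{2^{i+1}}$. Hence, $\{Q_i^\circ\}_{i \in \mathbb{N}}$ covers $E$ and $\sum_{i=1}^{\infty} |Q_i| = \sum_{i=1}^{\infty} |R_i| + \sum_{i=1}^{\infty} (|Q_i| - |R_i|) \le \gamma + \sum_{i=1}^{\infty} \frac{\epsilon - \gamma}{2^{i+1}} = \gamma + \frac{\epsilon - \gamma}{2} < \epsilon$.

The finite case is similar. Let $\{R_i\}_{1 \le i \le k}$ be a finite cover of $E$ by n-rectangles with $\sum_{i=1}^{k} |R_i| = \gamma < \epsilon$. By Theorem 12.7, we can choose n-rectangles $\{Q_i\}_{1 \le i \le k}$ so that $R_i \subset Q_i^\circ$ and $\mathrm{Vol}(Q_i) - \mathrm{Vol}(R_i) < \frac{\epsilon - \gamma}{2^{i+1}}$. As before, $\sum_{i=1}^{k} |Q_i| < \epsilon$.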


In the following theorem we notice that it would make no difference whether we used 
rectangles or open rectangles (interiors of rectangles) in the definitions of measure or volume. 


Theorem 12.9. Let $S \subset \mathbb{R}^n$.

(a) $\lambda(S) = 0$ if and only if, for every $\epsilon > 0$, there is a countable collection of rectangle interiors $\{R_i^\circ\}_{i \in \mathbb{N}}$ which covers $S$ so that $\sum_{i=1}^{\infty} |R_i| < \epsilon$.

(b) $\mathrm{Vol}(S) = 0$ if and only if, for every $\epsilon > 0$, there is a finite collection of rectangle interiors $\{R_i^\circ\}_{1 \le i \le m}$ which covers $S$ so that $\sum_{i=1}^{m} |R_i| < \epsilon$.



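Proof. If, for every $\epsilon > 0$, there is a countable collection of rectangle interiors $\{R_i^\circ\}_{i \in \mathbb{N}}$ which covers $S$ so that $\sum_{i=1}^{\infty} |R_i| < \epsilon$ then $\{R_i\}_{i \in \mathbb{N}}$ covers $S$, so $\lambda(S) = 0$. Likewise, if, for every $\epsilon > 0$ there is a finite collection of rectangle interiors $\{R_i^\circ\}_{1 \le i \le m}$ which covers $S$ so that $\sum_{i=1}^{m} |R_i| < \epsilon$, then $\{R_i\}_{1 \le i \le m}$ covers $S$, so $\mathrm{Vol}(S) = 0$.

Conversely, assume that $\{R_i\}_{i \in \mathbb{N}}$ is a countable cover of $S$ by rectangles $R_i$ so that $\sum_{i=1}^{\infty} |R_i| < \epsilon$. By Theorem 12.8, we know that there is a countable collection of rectangles $\{Q_i\}_{i \in \mathbb{N}}$ so that $R_i \subset Q_i^\circ$ for each $i \in \mathbb{N}$ and $\sum_{i=1}^{\infty} |Q_i| < \epsilon$.

If $\{R_i\}_{1 \le i \le k}$ is a finite cover of $S$ by rectangles $R_i$ so that $\sum_{i=1}^{k} |R_i| < \epsilon$ then, again, by Theorem 12.8, we can find $\{Q_i\}_{1 \le i \le k}$ so that $R_i \subset Q_i^\circ$ for each $i \in \{1, 2, 3, \ldots, k\}$ and $\sum_{i=1}^{k} |Q_i| < \epsilon$. In either case the interiors $Q_i^\circ$ cover $S$, as required.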


Definition 90 


Let $E \subset \mathbb{R}^n$ be bounded. If $\mathrm{Vol}(\partial(E)) = 0$ then we say that $E$ is a Jordan region.

If $E$ is a Jordan region then we define the volume or Jordan content of $E$ to be $\mathrm{Vol}(E) = (O)\mathrm{Vol}(E)$.


Note that in the above definition a finite union of rectangles is always closed, which means that any finite collection of rectangles covering $E$ also covers $\overline{E}$. Hence, if we let $D$ be the set of all finite sets of rectangles which cover $E$, then $(O)\mathrm{Vol}(E) = \inf_{C \in D} \sum_{R \in C} |R|$ (because $D$ is also the collection of all finite sets of rectangles that cover $\overline{E}$).

We will show later that a bounded set $E$ is a Jordan region if and only if $(O)\mathrm{Vol}(E) = (I)\mathrm{Vol}(E)$, which means that we could have used inner or outer volume in this definition of volume, and volume is defined if and only if it is the same as both inner and outer volume.

It may be instructive to address why we care about Jordan regions for a moment. A 
Jordan region is a region on which a characteristic function (a function whose value is one 
on the Jordan region and zero elsewhere) is integrable. Jordan regions are thus regions on 
which all functions which are otherwise integrable on a rectangle containing those Jordan 
regions are always integrable on the Jordan region (meaning that the characteristic function 
on the Jordan region times the original function is still integrable on the rectangle). The 
Jordan regions, therefore, are the nice domains over which it is reasonable to restrict an 
integrable function’s domain and still talk about the integral of the function over that 
domain. This will be developed more formally in the theorems that follow, but first we 
have to address ideas related to volume to formalize notions that most likely seem intuitive 
to us already. 




While volume zero always implies measure zero, the converse is false (consider the 
rational numbers in the real line, for instance). However, for a compact set (like a rectangle) 
the two ideas are equivalent, as shown below. 


Theorem 12.10. Let $E \subset \mathbb{R}^n$. If $\mathrm{Vol}(E) = 0$ then $\lambda(E) = 0$. If $E$ is compact and $\lambda(E) = 0$ then $\mathrm{Vol}(E) = 0$. A set $W$ is a Jordan region if and only if $W$ is bounded and $\lambda(\partial(W)) = 0$.


Proof. Let $\epsilon > 0$. If $\mathrm{Vol}(E) = 0$ then we can find a finite cover of $E$ by rectangles, the sum of whose volumes is less than $\epsilon$. Since this finite set is also countable, it follows that $\lambda(E) = 0$.

Let $E$ be compact with $\lambda(E) = 0$ and let $\epsilon > 0$. Then by Theorem 12.9, there is a collection of rectangles $\{R_i\}_{i \in \mathbb{N}}$ so that $\{R_i^\circ\}_{i \in \mathbb{N}}$ covers $E$ and $\sum_{i=1}^{\infty} |R_i| < \epsilon$. Since $E$ is compact, there is a finite subcover $\{R_{p_1}^\circ, R_{p_2}^\circ, \ldots, R_{p_k}^\circ\}$ which covers $E$, so $\sum_{i=1}^{k} |R_{p_i}| < \epsilon$. Thus, $\mathrm{Vol}(E) = 0$.

To see that a Jordan region $W$ is bounded we note that it must be possible to cover $W$ with a finite number of rectangles, the union of which must be bounded. For any bounded set $W$, since $\partial(W) = \overline{W} \setminus W^\circ$ is closed and bounded, it must follow that $\partial(W)$ is compact and therefore has volume zero if and only if it has Lebesgue measure zero. From this it follows that a set $W$ is a Jordan region if and only if it is bounded and its boundary has measure zero.


Next we show that a subset of a measure zero (or volume zero) set is always measure 
zero (or volume zero respectively). 


Theorem 12.11. Let $A \subset B$ and let $\lambda(B) = 0$. Then $\lambda(A) = 0$. Likewise, if $\mathrm{Vol}(B) = 0$ then $\mathrm{Vol}(A) = 0$.


Proof. Let $\epsilon > 0$. Assume $\lambda(B) = 0$. Then there is a cover of $B$ by a countable collection of rectangles $\{R_n\}$ so that $\sum_{n=1}^{\infty} |R_n| < \epsilon$. Since $\{R_n\}$ is also a cover for $A$ it follows that $\lambda(A) = 0$. Replacing the countable collection of rectangles by a finite collection of rectangles, we see by a similar argument that if $\mathrm{Vol}(B) = 0$ then $\mathrm{Vol}(A) = 0$.


We next demonstrate that countable sets always have measure zero (they do not always 
have volume zero). 


Theorem 12.12. Let $E = \{p_1, p_2, p_3, \ldots\}$ be countable. Then $\lambda(E) = 0$.




Proof. Let $\epsilon > 0$. Let $R_i$ be a rectangle containing $p_i$ of volume less than $\frac{\epsilon}{2^{i+1}}$. Then $\{R_i\}$ covers $E$ and $\sum_{i=1}^{\infty} |R_i| < \epsilon$, so $\lambda(E) = 0$.
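The geometric budget behind this argument is easy to check on the real line. The Python sketch below is illustrative only: the enumeration of rationals, the helper names, and the specific interval length $\epsilon / 2^{i+1}$ for the $i$-th point are assumptions (any summable choice with total below $\epsilon$ would do).

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """List the first `count` distinct rationals p/q in [0, 1], by increasing q."""
    seen, out = set(), []
    q = 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out

def cover(points, eps):
    """Give the i-th point (1-based) a closed interval of length eps / 2**(i+1)."""
    intervals = []
    for i, p in enumerate(points, start=1):
        half = eps / 2 ** (i + 2)          # half of the length eps / 2**(i+1)
        intervals.append((p - half, p + half))
    return intervals

eps = Fraction(1, 10)
pts = rationals_in_unit_interval(50)
ivs = cover(pts, eps)
print(float(sum(b - a for a, b in ivs)))   # total length stays below eps / 2
```

However many points are enumerated, the total length never reaches $\epsilon / 2$, which is why the same cover works for the whole countable set.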


Next, we show that a union of countably many sets of measure zero has measure zero. 


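Theorem 12.13. Let $\lambda(E_i) = 0$ for each $i \in \mathbb{N}$. Then $\lambda\left(\bigcup_{i=1}^{\infty} E_i\right) = 0$.

Proof. Let $\epsilon > 0$. For each $i \in \mathbb{N}$ choose rectangles $\{R_{(i,n)}\}_{n \in \mathbb{N}}$ which cover $E_i$ so that $\sum_{n=1}^{\infty} |R_{(i,n)}| < \frac{\epsilon}{2^{i+2}}$. Then $\sum_{i=1}^{\infty} \sum_{j=1}^{\infty} |R_{(i,j)}| < \epsilon$ and $\{R_{(i,j)}\}_{i,j \in \mathbb{N}}$ is a countable cover for $\bigcup_{i=1}^{\infty} E_i$, which therefore has Lebesgue measure zero.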


We next note that the closure of a Jordan region is always a Jordan region of equal 
volume, and that a set of outer volume zero is always a Jordan region (of volume zero). 


Theorem 12.14. Let $E$ be a Jordan region in $\mathbb{R}^n$. Then $\overline{E}$ is a Jordan region and $\mathrm{Vol}(\overline{E}) = \mathrm{Vol}(E)$. Furthermore, if $(O)\mathrm{Vol}(E) = 0$ then $E$ is a Jordan region of volume zero.


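Proof. Since $\partial(E) = \overline{E} \setminus E^\circ$ and $\partial(\overline{E}) = \overline{E} \setminus (\overline{E})^\circ$ and $E^\circ \subset (\overline{E})^\circ$, it follows that $\partial(\overline{E}) \subset \partial(E)$. Since $E$ is a Jordan region, $\mathrm{Vol}(\partial(E)) = 0$, which means that $\mathrm{Vol}(\partial(\overline{E})) = 0$ by Theorem 12.11, which means that $\overline{E}$ is a Jordan region.

Let $D$ be any finite collection of rectangles that covers $E$. Then since $\bigcup D$ is closed and contains $E$, we know $\bigcup D$ also contains $\overline{E}$. Likewise, any finite collection of rectangles that covers $\overline{E}$ also covers $E$. Since the set of finite collections of rectangles covering $E$ and the set of finite collections of rectangles that cover $\overline{E}$ are the same, $\mathrm{Vol}(E) = \mathrm{Vol}(\overline{E})$.

If $(O)\mathrm{Vol}(E) = 0$ then $\mathrm{Vol}(\overline{E}) = 0$, so $\mathrm{Vol}(\partial(E)) = 0 = \mathrm{Vol}(E)$ by Theorem 12.11.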


It is a consequence of the fact that rectangles are connected that if a rectangle intersects 
a set S and is not contained in the interior of S then the rectangle intersects the boundary 
of S, as described below. 


Theorem 12.15. Let $E \cap R \neq \emptyset$, where $R$ is a rectangle in $\mathbb{R}^n$. Then $R$ is connected, and $R \cap \partial(E) \neq \emptyset$ if and only if $R \not\subset E^\circ$.


Proof. We know $R$ is convex by Theorem 12.1 and therefore connected by Theorem 10.46. Assume that $R \not\subset E^\circ$ and $R \cap E \neq \emptyset$. Suppose $R \cap \partial(E) = \emptyset$. Then $R \cap E^\circ \neq \emptyset$. Let $H = R \cap E^\circ$ and let $K = R \setminus H$. Note that $H$ and $K$ are disjoint, non-empty and their union is $R$. Also, $K$ contains no points of $\overline{E}$ and thus no limit points of $H$ since there are no boundary points of $E$ in $K$.

If $p \in H$ then there is some $\delta > 0$ so that $B_\delta(p) \subset E$ since $p \in E^\circ$, which means that $p$ is not a limit point of $K$ since $K$ contains no points of $E$. Thus, $H$ and $K$ separate $R$, which is impossible since $R$ is connected. We conclude that $R$ must intersect the boundary of $E$.

We know that if $R \subset E^\circ$ then $R \cap \partial(E) = \emptyset$ by definition of boundary.


We now show that if one Jordan region is contained in a second one then the volume of the superset is at least as large as the volume of the subset.


Theorem 12.16. Let $W$ and $E$ be Jordan regions and let $W \subset E$. Then $\mathrm{Vol}(W) \le \mathrm{Vol}(E)$.


Proof. Let $D_E$ be the set of finite collections of rectangles that cover $E$ and let $V_E = \left\{\sum_{R \in C} |R| \;\middle|\; C \in D_E\right\}$, the set of all sums of volumes of elements of $D_E$. Let $D_W$ be the set of finite collections of rectangles that cover $W$ and let $V_W = \left\{\sum_{R \in C} |R| \;\middle|\; C \in D_W\right\}$, the set of all sums of volumes of elements of $D_W$.

Any cover of $E$ is also a cover of $W$, which means that $V_E \subset V_W$. Hence, $\mathrm{Vol}(W) = \inf(V_W) \le \inf(V_E) = \mathrm{Vol}(E)$.


Next, we explain why two rectangles (defined in the usual way with edge factors in the 
axes) will always have an intersection which is another rectangle if they overlap. 


Theorem 12.17. Let $R = \prod_{i=1}^{n} [a_i, b_i]$ and $Q = \prod_{i=1}^{n} [c_i, d_i]$ be rectangles whose interiors intersect in $\mathbb{R}^n$. Then $R \cap Q$ is a rectangle.

Proof. By definition, $R \cap Q = \prod_{i=1}^{n} [\max\{a_i, c_i\}, \min\{b_i, d_i\}]$. Since the intersection of the interiors is not empty, this is a rectangle.
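The coordinatewise max/min formula translates directly into code. The following Python sketch is an illustration only (the function name `intersect` and the list-of-pairs representation are assumptions); it returns `None` when the interiors do not meet, which is exactly the case the theorem excludes.

```python
def intersect(r, q):
    """Intersect two axis-aligned rectangles given as lists of (lo, hi) edges.
    Returns the rectangle prod [max(a_i, c_i), min(b_i, d_i)], or None when
    the interiors are disjoint (some edge would have non-positive length)."""
    out = []
    for (a, b), (c, d) in zip(r, q):
        lo, hi = max(a, c), min(b, d)
        if lo >= hi:
            return None
        out.append((lo, hi))
    return out

print(intersect([(0, 2), (0, 2)], [(1, 3), (-1, 1)]))   # edges [1, 2] and [0, 1]
print(intersect([(0, 1), (0, 1)], [(1, 2), (0, 1)]))    # rectangles share only a face
```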


We now demonstrate that rectangles are Jordan regions. 


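Theorem 12.18. Let $R = \prod_{i=1}^{n} [a_i, b_i]$ be a rectangle in $\mathbb{R}^n$. Then $R$ is a Jordan region.

Proof. Let $x = (x_1, x_2, x_3, \ldots, x_n) \in R$. If $a_i < x_i < b_i$ for each $i \in \{1, 2, 3, \ldots, n\}$ then let $m = \min\{x_1 - a_1, b_1 - x_1, x_2 - a_2, b_2 - x_2, \ldots, x_n - a_n, b_n - x_n\}$. Then $B_m(x) \subset \prod_{i=1}^{n} [x_i - m, x_i + m] \subset R$, which means that $x \in R^\circ$.

It follows that if $x \in \partial(R)$ then $x_i = a_i$ or $x_i = b_i$ for some $i$. We define rectangles $R_1(\delta), R_2(\delta), \ldots, R_{2n}(\delta)$ by $R_{2j-1}(\delta) = [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_{j-1}, b_{j-1}] \times \left[a_j - \frac{\delta}{2}, a_j + \frac{\delta}{2}\right] \times [a_{j+1}, b_{j+1}] \times \cdots \times [a_n, b_n]$ and $R_{2j}(\delta) = [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_{j-1}, b_{j-1}] \times \left[b_j - \frac{\delta}{2}, b_j + \frac{\delta}{2}\right] \times [a_{j+1}, b_{j+1}] \times \cdots \times [a_n, b_n]$ for each $j \in \{1, 2, \ldots, n\}$. Then $\partial(R) \subset \bigcup_{i=1}^{2n} R_i(\delta)$. Let $M = \prod_{i=1}^{n} (b_i - a_i + 1)$. Then $\sum_{i=1}^{2n} |R_i(\delta)| < \sum_{i=1}^{2n} M\delta = 2nM\delta$. Hence, if we choose $\delta < \frac{\epsilon}{2nM}$ then $\sum_{i=1}^{2n} |R_i(\delta)| < \epsilon$. Thus, $\mathrm{Vol}(\partial(R)) = 0$.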


We next prove that if we add the rectangle volumes in the grid rectangles induced by a 
grid then the sum of the volumes is the volume of the original rectangle on which the grid 
was taken. 


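Theorem 12.19. Let $R = \prod_{i=1}^{n} [a_i, b_i]$ be a rectangle in $\mathbb{R}^n$ and let $G = G(P)$ be a grid on $R$ where $P = \{P_1, P_2, \ldots, P_n\}$ and $P_i = \{x_0^{(i)}, x_1^{(i)}, \ldots, x_{n_i}^{(i)}\}$ for each $1 \le i \le n$. Then $|R| = \sum_{R_t \in G} |R_t|$.

Proof. We will induct on the dimension of $R$. If $n = 1$ then $R = [a_1, b_1]$, and $G$ is just a partition $P_1 = \{x_0^{(1)}, x_1^{(1)}, \ldots, x_{n_1}^{(1)}\}$, and $\sum_{j=1}^{n_1} \left(x_j^{(1)} - x_{j-1}^{(1)}\right) = b_1 - a_1 = |R|$.

Next, assume that for some $k \in \mathbb{N}$ it is true that for any grid $H$ on $\prod_{i=1}^{k} [a_i, b_i]$, the sum $\sum_{R_t \in H} |R_t| = \prod_{i=1}^{k} (b_i - a_i)$. Let $R = \prod_{i=1}^{k+1} [a_i, b_i]$ with grid $G = G(P_1, P_2, \ldots, P_k, P_{k+1})$ on $R$, where $H = H(P_1, P_2, \ldots, P_k)$. Then $G = \left\{R_t \times \left[x_{j-1}^{(k+1)}, x_j^{(k+1)}\right] \;\middle|\; R_t \in H \text{ and } 1 \le j \le n_{k+1}\right\}$.

Hence, $\sum_{R_t \in G} |R_t| = \sum_{j=1}^{n_{k+1}} \sum_{R_t \in H} |R_t| \left(x_j^{(k+1)} - x_{j-1}^{(k+1)}\right) = \sum_{j=1}^{n_{k+1}} \left(x_j^{(k+1)} - x_{j-1}^{(k+1)}\right) \prod_{i=1}^{k} (b_i - a_i) = \prod_{i=1}^{k+1} (b_i - a_i) = |R|$.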
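The identity can be checked exactly on a small example. The Python sketch below is illustrative only: the helper names `grid_cells` and `volume` and the particular partitions are assumptions. It builds every grid rectangle as a product of consecutive partition intervals and adds up the volumes with exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product
from math import prod

def grid_cells(partitions):
    """partitions: one sorted list of partition points per axis.
    Returns every grid rectangle as a tuple of (lo, hi) edges."""
    edges = [list(zip(p, p[1:])) for p in partitions]
    return list(product(*edges))

def volume(cell):
    """Product of the side lengths of a rectangle."""
    return prod(hi - lo for lo, hi in cell)

# A non-uniform grid on [0, 1] x [0, 5/2] x [-1, 4] (hypothetical data).
P = [
    [Fraction(0), Fraction(1, 3), Fraction(1)],
    [Fraction(0), Fraction(1, 2), Fraction(2), Fraction(5, 2)],
    [Fraction(-1), Fraction(0), Fraction(4)],
]
total = sum(volume(c) for c in grid_cells(P))
print(total)   # equals 1 * 5/2 * 5
```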


In the next theorem, we show that the sum of the volumes of rectangles in a grid on 
one rectangle which are contained in a second rectangle is less than or equal to the volume 
of the second rectangle. 


Theorem 12.20. Let $G = \{Q_t\}_{1 \le t \le k}$ be a grid on a rectangle $W$ in $\mathbb{R}^n$. Let $R$ be a rectangle in $\mathbb{R}^n$. Let $C = \{Q_t \in G \mid Q_t \subset R\}$. Then $\sum_{Q_t \in C} |Q_t| \le |R|$.


Proof. If $R$ and $W$ do not overlap then $C = \emptyset$ and the result follows. Assume that $R$ and $W$ overlap. Let $G = G(P)$ be a grid on $W$ where $P = \{P_1, P_2, \ldots, P_n\}$ and $P_i = \{x_0^{(i)}, x_1^{(i)}, \ldots, x_{n_i}^{(i)}\}$ for each $1 \le i \le n$. Let $R = \prod_{i=1}^{n} [a_i, b_i]$ for some $a_i, b_i$ for $1 \le i \le n$. Then $H = H\left((P_1 \cap [a_1, b_1]) \cup \{a_1, b_1\}, (P_2 \cap [a_2, b_2]) \cup \{a_2, b_2\}, \ldots, (P_n \cap [a_n, b_n]) \cup \{a_n, b_n\}\right)$ is a grid on $R$ where each element of $H$ is contained in some element of $G$. Furthermore, $C \subset H$ since if $Q_t \in G$ and $Q_t \subset R$ then it must follow that $Q_t = \prod_{i=1}^{n} \left[x_{s_i - 1}^{(i)}, x_{s_i}^{(i)}\right]$ for some choices of $1 \le s_i \le n_i$. Since $Q_t \subset R$ it follows that $\left[x_{s_i - 1}^{(i)}, x_{s_i}^{(i)}\right] \subset [a_i, b_i]$ for each $i \in \{1, 2, 3, \ldots, n\}$, which means that $x_{s_i - 1}^{(i)}, x_{s_i}^{(i)} \in P_i \cap [a_i, b_i]$ and therefore $Q_t \in H$. Hence, for each $Q_t \in C$ we know that $Q_t \in H \cap G$. It follows that $\sum_{Q_t \in C} |Q_t| \le \sum_{Q_t \in H \cap G} |Q_t| \le \sum_{K_j \in H} |K_j| = |R|$ by Theorem 12.19.
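As a quick sanity check of this inequality, the Python sketch below sums the volumes of those cells of a grid on $W = [0,4]^2$ that are contained in a second rectangle $R$. The helper names and the specific rectangles are hypothetical choices for illustration.

```python
from fractions import Fraction
from itertools import product
from math import prod

def grid_cells(partitions):
    """All grid rectangles for the given per-axis partition points."""
    edges = [list(zip(p, p[1:])) for p in partitions]
    return list(product(*edges))

def volume(cell):
    return prod(hi - lo for lo, hi in cell)

def contained_sum(partitions, rect):
    """Sum of |Q_t| over the grid cells Q_t that are subsets of rect."""
    return sum(
        volume(c)
        for c in grid_cells(partitions)
        if all(a <= lo and hi <= b for (lo, hi), (a, b) in zip(c, rect))
    )

# Unit grid on W = [0, 4] x [0, 4]; R = [1/2, 16/5] x [1, 5/2] (hypothetical data).
P = [[Fraction(k) for k in range(5)] for _ in range(2)]
R = [(Fraction(1, 2), Fraction(16, 5)), (Fraction(1), Fraction(5, 2))]
print(contained_sum(P, R), volume(R))
```

Only the cells $[1,2] \times [1,2]$ and $[2,3] \times [1,2]$ fit inside $R$ here, so the sum falls well short of $|R|$.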


We now show that the volume $|R|$ of a rectangle (the product of its side lengths) is the same as the volume $\mathrm{Vol}(R)$ of the rectangle.


Theorem 12.21. Let $D = \{R_1, R_2, \ldots, R_k\}$ be a collection of rectangles so that $\bigcup_{i=1}^{k} R_i^\circ$ covers rectangle $R$. Then $\sum_{i=1}^{k} |R_i| \ge |R|$, and $\mathrm{Vol}(R) = |R|$.


Proof. Since $R$ is compact, by the Lebesgue Number Lemma we can find $\delta > 0$ so that if $S$ is a set intersecting $R$ of diameter less than $\delta$ then $S \subset R_i^\circ$ for some $i \in \{1, 2, 3, \ldots, k\}$. Let $G = G(P)$ be a grid on $R$ where $P = \{P_1, P_2, \ldots, P_n\}$ and $P_i = \{x_0^{(i)}, x_1^{(i)}, \ldots, x_{n_i}^{(i)}\}$ for each $1 \le i \le n$, so that $|G| < \delta$. For each $i \in \{1, 2, 3, \ldots, k\}$, define $C_i = \{Q_t \in G \mid Q_t \subset R_i\}$. Then $\sum_{i=1}^{k} \sum_{Q_t \in C_i} |Q_t| \ge |R|$ since every $Q_t \in G$ belongs to $C_i$ for some $i$, and $\sum_{Q_t \in G} |Q_t| = |R|$ by Theorem 12.19.

For each $i$, we know $\sum_{Q_t \in C_i} |Q_t| \le |R_i|$ by Theorem 12.20, so it follows that $|R| \le \sum_{i=1}^{k} \sum_{Q_t \in C_i} |Q_t| \le \sum_{i=1}^{k} |R_i|$. Thus, $|R| \le \mathrm{Vol}(R)$ by Theorem 12.9. Since $\{R\}$ is a cover of $R$, we know that $\mathrm{Vol}(R) \le |R|$. Hence, $\mathrm{Vol}(R) = |R|$.


We next show that if you take any rectangle $R$ containing a Jordan region $E$ then the infimum of all of the outer sums of $E$ with respect to grids on $R$ is the same as the volume of the Jordan region. This lets us characterize Jordan regions with grids, which is sometimes helpful. We observe that this is true regardless of the choice of rectangle $R$ used, so it makes no difference to the infimum of the outer sums of $E$ which rectangle containing $E$ is used.




Theorem 12.22. Let $E$ be a Jordan region contained in a rectangle $R$ in $\mathbb{R}^n$. Then $\mathrm{Vol}(E) = \inf_G V(E,G)$, where $G$ ranges over all grids on $R$.


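Proof. By definition, for any grid $G$ on $R$, the outer sum $V(E,G) = \sum_{\{R_t \in G \mid R_t \cap \overline{E} \neq \emptyset\}} |R_t|$, which is a sum of volumes of the elements of a finite cover of $\overline{E}$ by rectangles, so $V(E,G) \ge \mathrm{Vol}(E)$ for all grids $G$, which means that $\inf_G V(E,G) \ge \mathrm{Vol}(E)$.

Let $\epsilon > 0$. We know that $\overline{E}$ is a Jordan region with $\mathrm{Vol}(\overline{E}) = \mathrm{Vol}(E)$ by Theorem 12.14. Let $\{R_i^\circ\}_{1 \le i \le k}$ be a finite cover of $\overline{E}$ by interiors of rectangles $R_i$ so that $\sum_{i=1}^{k} |R_i| < \mathrm{Vol}(E) + \epsilon$. By the Lebesgue Number Lemma we can find $\delta > 0$ so that if $S$ is a set with $\mathrm{diam}(S) < \delta$ and $S$ intersects $\overline{E}$ then $S \subset R_i^\circ$ for some $i$. Choose a grid $H = \{Q_t\}_{1 \le t \le w}$ on $R$ with $|H| < \delta$. For each $i \in \{1, 2, 3, \ldots, k\}$ let $C_i = \{Q_t \in H \mid Q_t \subset R_i\}$. By Theorem 12.20, we know that $\sum_{Q_t \in C_i} |Q_t| \le |R_i|$, so $\sum_{i=1}^{k} \sum_{Q_t \in C_i} |Q_t| < \mathrm{Vol}(E) + \epsilon$. Furthermore, if $Q_t \cap \overline{E} \neq \emptyset$ then $Q_t \subset R_i^\circ$ for some $i$, which means that $V(E,H) = \sum_{\{Q_t \in H \mid Q_t \cap \overline{E} \neq \emptyset\}} |Q_t| \le \sum_{i=1}^{k} \sum_{Q_t \in C_i} |Q_t| < \mathrm{Vol}(E) + \epsilon$. Hence, $\inf_G V(E,G) \le \mathrm{Vol}(E)$ and therefore $\inf_G V(E,G) = \mathrm{Vol}(E)$.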


Next, we show that, much as integrals exist when upper sums can be made arbitrarily close to lower sums, a region is a Jordan region (and volume exists) if and only if we can find grids where the inner and outer sums over those grids can be made as close as we wish.


Theorem 12.23. Let $E \subset \mathbb{R}^n$. Then $E$ is a Jordan region if and only if, for every $\epsilon > 0$ there is a grid $G$ on a rectangle $R$ containing $E$ so that $V(E,G) - v(E,G) = V(\partial(E),G) < \epsilon$, in which case $\mathrm{Vol}(E) = \sup_G v(E,G) = (I)\mathrm{Vol}(E) = (O)\mathrm{Vol}(E) = \inf_G V(E,G)$.


Proof. We know that $E$ is a Jordan region if and only if $\mathrm{Vol}(\partial(E)) = 0$, which is true if and only if for every $\epsilon > 0$ there is a grid $G$ on a rectangle $R$ containing $E$ so that $V(\partial(E),G) = \sum_{\{R_j \in G \mid R_j \cap \partial(E) \neq \emptyset\}} |R_j| < \epsilon$. We know that $R_j \cap \partial(E) \neq \emptyset$ if and only if $R_j \not\subset E^\circ$ by Theorem 12.15. Hence, $S(\partial(E),G) = O(E,G) \setminus I(E,G)$. Therefore, $V(E,G) - v(E,G) = \sum_{R_j \in O(E,G)} |R_j| - \sum_{R_j \in I(E,G)} |R_j| = \sum_{R_j \in S(\partial(E),G)} |R_j| = V(\partial(E),G)$. Thus, $E$ is a Jordan region if and only if for every $\epsilon > 0$ we can find a grid $G$ so that $V(\partial(E),G) < \epsilon$, which is true if and only if $V(E,G) - v(E,G) < \epsilon$.

Let $E$ be a Jordan region and note that $\mathrm{Vol}(E) = \inf_G V(E,G)$. By Theorem 12.16 we know that $v(E,G) \le \mathrm{Vol}(E)$ for every grid $G$. Let $\epsilon > 0$. Choose a grid $G$ so that $V(E,G) < \mathrm{Vol}(E) + \epsilon$ and a grid $H$ so that $V(E,H) - v(E,H) < \epsilon$. Let $K$ be a refinement of $G$ and $H$. Then $V(E,K) - \epsilon < v(E,H) \le \mathrm{Vol}(E) \le V(E,K) < \mathrm{Vol}(E) + \epsilon$. Since this is true for all $\epsilon > 0$ it follows that $\mathrm{Vol}(E) = \sup_G v(E,G)$.




The intersection, set difference and union of Jordan regions is a Jordan region, and 
the volume of the union of non-overlapping Jordan regions is the sum of the volumes. 
Combining this with earlier theorems helps us to start to use rectangle volumes and grids 
in a way that fits with our intuition. In particular, the sum of the volumes of rectangles in 
a grid is the volume of the union of those rectangles. This gets us to the stage where
we have rigorously established that volume works largely in the way that we would like it 
to since these grid rectangles can then be used to approximate the volumes of every other 
Jordan region. 


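Theorem 12.24. Let $E_1$ and $E_2$ be Jordan regions. Then $E_1 \cup E_2$, $E_1 \cap E_2$ and $E_1 \setminus E_2$ are Jordan regions so that $\mathrm{Vol}(E_1 \cup E_2) \le \mathrm{Vol}(E_1) + \mathrm{Vol}(E_2)$, $\mathrm{Vol}(E_1 \cap E_2) \le \mathrm{Vol}(E_1)$ and $\mathrm{Vol}(E_1 \setminus E_2) \le \mathrm{Vol}(E_1)$. If $E_1$ and $E_2$ are non-overlapping then $\mathrm{Vol}(E_1 \cup E_2) = \mathrm{Vol}(E_1) + \mathrm{Vol}(E_2)$.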


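Proof. Let $\epsilon > 0$. We can find $C_1 = \{R_1, R_2, \ldots, R_k\}$ and $C_2 = \{T_1, T_2, \ldots, T_m\}$, finite collections of rectangles covering $E_1$ and $E_2$ respectively, so that $\sum_{i=1}^{k} |R_i| < \mathrm{Vol}(E_1) + \frac{\epsilon}{2}$ and $\sum_{i=1}^{m} |T_i| < \mathrm{Vol}(E_2) + \frac{\epsilon}{2}$. Then $C_1 \cup C_2$ is a finite collection of rectangles covering $E_1 \cup E_2$, the sum of whose volumes is less than $\mathrm{Vol}(E_1) + \mathrm{Vol}(E_2) + \epsilon$. In particular, applying the same argument to $\partial(E_1)$ and $\partial(E_2)$, since $\mathrm{Vol}(\partial(E_1)) = 0$ and $\mathrm{Vol}(\partial(E_2)) = 0$ we can choose $C_1 \cup C_2$ to be a finite collection of rectangles covering $\partial(E_1) \cup \partial(E_2)$, the sum of whose volumes is less than $\epsilon$, which means that $\partial(E_1) \cup \partial(E_2)$ is a Jordan region with $\mathrm{Vol}(\partial(E_1) \cup \partial(E_2)) = 0$ by Theorem 12.14.

Since $\partial(E_1 \cup E_2) \subset \partial(E_1) \cup \partial(E_2)$ and $\partial(E_1 \cap E_2) \subset \partial(E_1) \cup \partial(E_2)$ and $\partial(E_1 \setminus E_2) \subset \partial(E_1) \cup \partial(E_2)$, it follows from Theorem 12.11 that $\mathrm{Vol}(\partial(E_1 \cup E_2)) = 0$, $\mathrm{Vol}(\partial(E_1 \cap E_2)) = 0$, and $\mathrm{Vol}(\partial(E_1 \setminus E_2)) = 0$, so $E_1 \cup E_2$, $E_1 \cap E_2$ and $E_1 \setminus E_2$ are Jordan regions.

Since $E_1 \cap E_2 \subset E_1$ and $E_1 \setminus E_2 \subset E_1$ we know that $\mathrm{Vol}(E_1 \cap E_2) \le \mathrm{Vol}(E_1)$ and $\mathrm{Vol}(E_1 \setminus E_2) \le \mathrm{Vol}(E_1)$ by Theorem 12.16. Since $\mathrm{Vol}(E_1 \cup E_2) < \mathrm{Vol}(E_1) + \mathrm{Vol}(E_2) + \epsilon$ for all $\epsilon > 0$ it follows that $\mathrm{Vol}(E_1 \cup E_2) \le \mathrm{Vol}(E_1) + \mathrm{Vol}(E_2)$.

Next, assume that $E_1$ and $E_2$ are non-overlapping. Then we know that $\mathrm{Vol}(E_1 \cap E_2) = 0$. Note that if $E_1^\circ \cap E_2^\circ \neq \emptyset$ then this intersection contains a rectangle $L$ with $|L| > 0$, which means that $|L| \le \mathrm{Vol}(E_1 \cap E_2)$ by Theorem 12.16, which is impossible. Hence, $E_1 \cap E_2 \subset \partial(E_1) \cup \partial(E_2)$, which has volume zero.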

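Let $\epsilon > 0$, let $R$ be a rectangle containing $E_1 \cup E_2$, and write $E = E_1 \cup E_2$. By Theorem 12.22, we can find a grid $G_1$ on $R$ so that $V(\partial(E_1) \cup \partial(E_2), G_1) < \epsilon$. We can also find a grid $G_2$ on $R$ so that $V(E_1 \cup E_2, G_2) < \mathrm{Vol}(E_1 \cup E_2) + \epsilon$. Let $H$ be any refinement of $G_1 * G_2$, a common refinement of $G_1$ and $G_2$. Then by Theorem 12.3, it follows that $V(E_1 \cup E_2, H) < \mathrm{Vol}(E_1 \cup E_2) + \epsilon$ and also $V(\partial(E_1) \cup \partial(E_2), H) < \epsilon$.

Let $C_1 = \{W_t \in H \mid W_t \subset E_1^\circ\}$, $C_2 = \{W_t \in H \mid W_t \subset E_2^\circ\}$ and $C_3 = \{W_t \in H \mid W_t \cap (\partial(E_1) \cup \partial(E_2)) \neq \emptyset\}$. By Theorem 12.15, we know that each rectangle in $H$ which intersects $E_1$ either intersects the boundary of $E_1$ or is contained in the interior of $E_1$. Hence, $C_1 \cup C_3$ covers $E_1$, which means that $\sum_{W_t \in C_1} |W_t| + \sum_{W_t \in C_3} |W_t| \ge \mathrm{Vol}(E_1)$. Since $\sum_{W_t \in C_3} |W_t| < \epsilon$, it follows that $\sum_{W_t \in C_1} |W_t| > \mathrm{Vol}(E_1) - \epsilon$. By a similar argument, $\sum_{W_t \in C_2} |W_t| > \mathrm{Vol}(E_2) - \epsilon$. Since $E_1^\circ \cap E_2^\circ = \emptyset$, it follows that $C_1 \cap C_2 = \emptyset$ and therefore $V(E,H) = \sum_{\{W_t \in H \mid W_t \cap \overline{E_1 \cup E_2} \neq \emptyset\}} |W_t| \ge \sum_{W_t \in C_1} |W_t| + \sum_{W_t \in C_2} |W_t| > \mathrm{Vol}(E_1) + \mathrm{Vol}(E_2) - 2\epsilon$.

The grid $H$ could have been chosen to refine any grid, and $\epsilon$ could have been any positive number, which means that if $G$ is a grid on $R$ then for each $\epsilon > 0$ we can choose a grid $H$ refining $G$ so that $V(E,H) \le V(E,G)$ and $V(E,H) > \mathrm{Vol}(E_1) + \mathrm{Vol}(E_2) - 2\epsilon$, and therefore $V(E,G) > \mathrm{Vol}(E_1) + \mathrm{Vol}(E_2) - 2\epsilon$ for every $\epsilon > 0$, and thus $V(E,G) \ge \mathrm{Vol}(E_1) + \mathrm{Vol}(E_2)$. Since this is true for all grids $G$ on $R$ we know that $\mathrm{Vol}(E_1 \cup E_2) = \inf_G V(E,G) \ge \mathrm{Vol}(E_1) + \mathrm{Vol}(E_2)$. It follows that $\mathrm{Vol}(E_1 \cup E_2) = \mathrm{Vol}(E_1) + \mathrm{Vol}(E_2)$.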
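A small exact computation illustrates additivity for two non-overlapping regions. In the Python sketch below (an illustration with hypothetical names and data, not the author's construction), $E_1 = [0,1] \times [0,1]$ and $E_2 = [1,2] \times [0,1]$ share only an edge, and outer sums are taken over uniform grids on $[0,2] \times [0,1]$, counting every grid rectangle that meets the closed region.

```python
from fractions import Fraction
from itertools import product

def outer_sum(boxes, xs, ys):
    """Outer sum of the union of closed boxes ((a1, b1), (a2, b2)) over the
    grid with partition points xs and ys: add the area of every grid
    rectangle that meets at least one of the boxes."""
    total = Fraction(0)
    for (x0, x1), (y0, y1) in product(zip(xs, xs[1:]), zip(ys, ys[1:])):
        meets = any(
            a1 <= x1 and x0 <= b1 and a2 <= y1 and y0 <= b2
            for (a1, b1), (a2, b2) in boxes
        )
        if meets:
            total += (x1 - x0) * (y1 - y0)
    return total

F = Fraction
E1 = ((F(0), F(1)), (F(0), F(1)))
E2 = ((F(1), F(2)), (F(0), F(1)))
for n in (4, 16, 64):
    xs = [F(2 * i, n) for i in range(n + 1)]   # n equal steps across [0, 2]
    ys = [F(0), F(1)]
    print(n, outer_sum([E1], xs, ys), outer_sum([E2], xs, ys),
          outer_sum([E1, E2], xs, ys))
```

Each single outer sum overshoots by one column of width $2/n$ (the column that merely touches the shared edge), so the outer sums of $E_1$ and $E_2$ tend to 1 while the outer sum of the union is exactly 2.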


Definition 91 


Let $f : E \to \mathbb{R}$ be bounded, $E$ a Jordan region contained in a rectangle $R$. We define the characteristic function of a set $S$ to be $\chi_S(x) = 1$ if $x \in S$ and $\chi_S(x) = 0$ otherwise. We define the zero extension of $f$ to $R$ to be the function $F(x) = \chi_E(x) f(x)$. We say that $f$ is integrable on $E$ if the zero extension $F$ of $f$ is integrable on $R$, and define $\int_E f = \int_R F$. If $E \subset V$, another Jordan region (but $f$ is only defined on $E$) then if $R$ is a rectangle containing $V$ we also define $\int_V f = \int_E f = \int_R F$. We also define the zero boundary extension of $f$ to $E$ to be $G(x) = \chi_{E^\circ}(x) f(x)$, the function which is $f$ on the interior of $E$, but zero on the boundary of $E$ and outside of $E$.


We next show that any bounded function on a domain of volume zero is integrable, and 
has an integral of zero. 


Theorem 12.25. Let $E \subset \mathbb{R}^n$ so that $\mathrm{Vol}(E) = 0$, and let $f : E \to \mathbb{R}$ be a bounded function. Then $\int_E f = 0$.


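Proof. Let $\epsilon > 0$ and let $M > |f(x)|$ for all $x \in E$. Since $\mathrm{Vol}(E) = 0$ there is a grid $G = \{R_i\}_{1 \le i \le k}$ on a rectangle containing $E$ so that $V(E,G) < \frac{\epsilon}{2M}$ by Theorem 12.22, which means that $\left|\sum_{R_i \in S(E,G)} M_i|R_i|\right| \le M \sum_{R_i \in O(E,G)} |R_i| < \frac{\epsilon}{2}$ and $\left|\sum_{R_i \in S(E,G)} m_i|R_i|\right| \le M \sum_{R_i \in O(E,G)} |R_i| < \frac{\epsilon}{2}$. Thus, $U(f,G) - L(f,G) < \epsilon$ so $f$ is integrable, and $-\frac{\epsilon}{2} < L(f,G) \le \int_E f \le U(f,G) < \frac{\epsilon}{2}$. Since this is true for all $\epsilon > 0$ we know that $\int_E f = 0$.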
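For a concrete instance, take $E$ to be the vertical segment $\{1/2\} \times [0,1]$ inside $[0,1]^2$ and $f = 1$ on $E$. The Python sketch below (the function name and the example are assumptions made for illustration) computes the upper sum of the zero extension on a uniform grid; the lower sum is 0 because every grid square contains points off the segment.

```python
from fractions import Fraction

def upper_sum_segment(n):
    """Upper sum of the zero extension of f = 1 on E = {1/2} x [0, 1]
    over the uniform n-by-n grid on [0, 1] x [0, 1]."""
    h = Fraction(1, n)
    half = Fraction(1, 2)
    total = Fraction(0)
    for i in range(n):
        if i * h <= half <= (i + 1) * h:   # column of squares meeting E
            total += n * h * h             # n squares, each of area h*h, sup = 1
    return total

for n in (2, 10, 101, 1000):
    print(n, upper_sum_segment(n))
```

At most two columns of squares meet the segment, so the upper sum is at most $2/n$ and the integral is squeezed to zero.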


It can be helpful in some instances to know that if we can cover a finite set of rectangles 
with a countable collection the sum of whose volumes is small, then the volume of the union 
of the finite set of rectangles is also small. 




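Theorem 12.26. Let $\mathcal{F} = \{R_i\}_{1 \le i \le k}$ be a finite collection of non-overlapping rectangles in $\mathbb{R}^n$, and let $\epsilon > 0$. Let $\{I_i^\circ\}_{i \in \mathbb{N}}$ be a cover of $\bigcup \mathcal{F}$, where each $I_i$ is also a rectangle in $\mathbb{R}^n$ and $\sum_{i=1}^{\infty} |I_i| < \epsilon$. Then $\sum_{i=1}^{k} |R_i| < \epsilon$.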


Proof. Since $\bigcup \mathcal{F}$ is a finite union of rectangles, each of which is compact, $\bigcup \mathcal{F}$ is closed and bounded and therefore compact by the Heine-Borel Theorem. By the Lebesgue Number Lemma there is a $\delta > 0$ so that if $S$ is a set with $\mathrm{diam}(S) < \delta$ and $S \cap (\bigcup \mathcal{F}) \neq \emptyset$ then $S \subset I_j^\circ$ for some natural number $j$. Choose grids $G_i$ on $R_i$ for $1 \le i \le k$ so that $|G_i| < \delta$ for each $i \in \{1, 2, 3, \ldots, k\}$, and let $W = \bigcup_{i=1}^{k} G_i$. For each natural number $i$, let $C_i = \{Q_t \in W \mid Q_t \subset I_i\}$. Observe that $C_i$ is a collection of non-overlapping rectangles because if $Q_t \in G_m$ and $Q_s \in G_r$ with $m \neq r$ then $Q_t \cap Q_s \subset R_m \cap R_r$, which has volume zero since $R_m$ and $R_r$ are non-overlapping, and if $Q_t, Q_s$ are contained in the same grid $G_m$ then $Q_t, Q_s$ are non-overlapping because elements of a grid on a rectangle do not overlap.

Since there are only finitely many elements in $W$, there is a last integer $j$ so that $C_j \neq \emptyset$. Since $\mathrm{diam}(Q_t) < \delta$ for all $Q_t \in W$ we know that each $Q_t$ is contained in some $I_i$, which means that $W = \bigcup_{i=1}^{j} C_i$. We know that $\mathrm{Vol}(\bigcup C_i) \le |I_i|$ for each $i \in \{1, 2, 3, \ldots, j\}$ since $\bigcup C_i \subset I_i$. Since the elements of $C_i$ are non-overlapping, we know that $\mathrm{Vol}(\bigcup C_i) = \sum_{Q_t \in C_i} |Q_t| = \sum_{\{Q_t \in W \mid Q_t \subset I_i\}} |Q_t| \le |I_i|$. From this, it follows that $\sum_{Q_t \in W} |Q_t| = \sum_{i=1}^{k} |R_i| \le \sum_{i=1}^{j} |I_i| \le \sum_{i=1}^{\infty} |I_i| < \epsilon$.


Next we show that a set has volume zero if and only if we can cover the set by finite
collections of cubes the sum of whose volumes can be made arbitrarily small. The advantage
to being able to use cubes is that under certain nice mappings we can make the images of 
cubes behave better than arbitrary rectangles (the volume of the domain and range are 
more easily comparable, for instance). 


Theorem 12.27. Let $E \subset \mathbb{R}^n$. Then $\mathrm{Vol}(E) = 0$ if and only if for every $\epsilon > 0$ there is a finite collection $\mathcal{F} = \{C_i\}_{1 \le i \le k}$ of cubes of equal side length whose interiors cover $E$, the sum of whose volumes is less than $\epsilon$. Furthermore, for every $\zeta > 0$ we can choose such a collection $\mathcal{F}$ so that the side length of the elements of $\mathcal{F}$ is less than $\zeta$.


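Proof. If such a collection of cubes exists for every $\epsilon > 0$ then by definition $\mathrm{Vol}(E) = 0$.

Assume $\mathrm{Vol}(E) = 0$ and let $\epsilon > 0$. Then by Theorems 12.14 and 12.9 we can find rectangles $\{R_i\}_{1 \le i \le t}$ so that $\overline{E} \subset \bigcup_{i=1}^{t} R_i^\circ$ and $\sum_{i=1}^{t} |R_i| < \frac{\epsilon}{2}$.

Let $\zeta > 0$. By the Lebesgue Number Lemma we can find a $\delta > 0$ so that if a set $S$ has diameter less than $\delta$ and $S \cap \overline{E} \neq \emptyset$ then $S \subset R_i^\circ$ for some $i \in \{1, 2, 3, \ldots, t\}$. Choose a cube $Q$ containing $\bigcup_{i=1}^{t} R_i$, and a partition $P$ for $Q$ consisting of equally spaced partition points in each edge factor of $Q$ inducing a grid $G$ whose elements are cubes $\{Q_i\}_{1 \le i \le s}$ of equal side length $L < \zeta$ so that $|G| = \sqrt{n} L < \delta$. Let $T = \{Q_i \in G \mid Q_i \cap \overline{E} \neq \emptyset\} = \{Q_{n_1}, Q_{n_2}, \ldots, Q_{n_k}\}$. By Theorem 12.26, we know that $\sum_{i=1}^{k} |Q_{n_i}| < \frac{\epsilon}{2}$.

By Theorem 12.7 we can find $\gamma \in (L, \zeta)$ so that for each $Q_{n_i} \in T$, we can find a cube $W_i$ of side length $\gamma$ so that $Q_{n_i} \subset W_i^\circ$ and $|W_i| - |Q_{n_i}| < \frac{\epsilon}{2k}$. Then $E \subset \bigcup_{i=1}^{k} Q_{n_i} \subset \bigcup_{i=1}^{k} W_i^\circ$ and $\sum_{i=1}^{k} |W_i| = \sum_{i=1}^{k} |Q_{n_i}| + \sum_{i=1}^{k} \left(|W_i| - |Q_{n_i}|\right) < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$.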
i=l i=l i=l 


Next, we show that Riemann sums can be made as close as we wish to upper and lower 
sums. 


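Theorem 12.28. Let $f : R \to \mathbb{R}$ be bounded, where $R$ is a rectangle, let $\epsilon > 0$ and let $G$ be a grid on $R$. Then there are markings $\mathcal{T}, \mathcal{R}$ of $G$ so that $U(f,G) - S_{\mathcal{T}}(f,G) < \epsilon$ and $S_{\mathcal{R}}(f,G) - L(f,G) < \epsilon$.

Proof. Since $M_t = \sup_{x \in R_t} f(x)$ and $m_t = \inf_{x \in R_t} f(x)$, we can find points $r_t, s_t \in R_t$ so that $M_t - f(s_t) < \frac{\epsilon}{|R|}$ and $f(r_t) - m_t < \frac{\epsilon}{|R|}$. This gives us markings $\mathcal{T} = \{s_t \mid R_t \in G\}$, $\mathcal{R} = \{r_t \mid R_t \in G\}$ so that $U(f,G) - S_{\mathcal{T}}(f,G) < \epsilon$ and $S_{\mathcal{R}}(f,G) - L(f,G) < \epsilon$.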
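The role of the marking can be seen in an example where the supremum and infimum on each grid rectangle are actually attained. The Python sketch below is illustrative only: the function $f(x,y) = x + y$, the uniform grid on $[0,1]^2$, and all helper names are assumptions. Since $f$ increases in each variable, marking the top-right corner reproduces the upper sum and marking the bottom-left corner reproduces the lower sum.

```python
from fractions import Fraction

def riemann_sum(f, n, tag):
    """Riemann sum of f over the uniform n-by-n grid on [0, 1] x [0, 1].
    tag(x0, x1, y0, y1) returns the marked point chosen in each grid square."""
    h = Fraction(1, n)
    total = Fraction(0)
    for i in range(n):
        for j in range(n):
            x, y = tag(i * h, (i + 1) * h, j * h, (j + 1) * h)
            total += f(x, y) * h * h
    return total

f = lambda x, y: x + y
top_right = lambda x0, x1, y0, y1: (x1, y1)       # attains the sup on each square
bottom_left = lambda x0, x1, y0, y1: (x0, y0)     # attains the inf on each square
midpoint = lambda x0, x1, y0, y1: ((x0 + x1) / 2, (y0 + y1) / 2)

n = 10
print(riemann_sum(f, n, top_right))     # the upper sum, (n + 1) / n
print(riemann_sum(f, n, bottom_left))   # the lower sum, (n - 1) / n
print(riemann_sum(f, n, midpoint))      # lies between the two
```

For a function that does not attain its supremum on a rectangle, a marking can only come within $\epsilon$ of the upper sum, which is what the theorem asserts.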


We parallel theorems in single variable integration by next showing that if a function is 
integrable then a grid with sufficiently small mesh will have upper and lower sums that are 
as close together as we wish. This is an “if and only if” condition but the other direction
has already been proven (we already know that if the upper and lower sums can be made 
as close as we wish then the function is integrable). 


Theorem 12.29. Let $R$ be a rectangle in $\mathbb{R}^n$ and let $E$ be a Jordan region contained in $R$. Let $f : E \to \mathbb{R}$ be integrable. Then for any $\epsilon > 0$ there is a $\delta > 0$ so that if $G$ is a grid on $R$ so that $|G| < \delta$ then $U(f,G) - L(f,G) < \epsilon$.


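Proof. Since $\epsilon > 0$ we can find a grid $G$ on $R$ so that $U(f,G) - L(f,G) < \frac{\epsilon}{2}$ and $G = \{R_1, R_2, \ldots, R_k\}$. Since $f$ is bounded we can find $M > 0$ so that $|f(x)| < M$ for all $x \in E$. Let $S = \bigcup_{R_i \in G} \partial(R_i)$. Then the volume of $S$ is zero, so we can find a finite covering of $S$ by cubes $\{Q_1, Q_2, \ldots, Q_m\}$ of equal side length $s$ so that $\sum_{i=1}^m |Q_i| < \frac{\epsilon}{4M}$. By increasing $s$ slightly we obtain cubes $\{B_1, B_2, \ldots, B_m\}$ of side length $t > s$ so that $Q_i \subset B_i^\circ$ for each $i$ and $\sum_{i=1}^m |B_i| < \frac{\epsilon}{4M}$. Thus, the sets $\{B_1^\circ, B_2^\circ, \ldots, B_m^\circ\}$ are an open covering of $S$. By the Lebesgue Number Lemma we can find a $\delta > 0$ so that if $T$ is a set of diameter less than $\delta$ and $T \cap S \neq \emptyset$ then $T \subset B_i^\circ$ for some $i$.

Let $H$ be a grid on $R$ with mesh $|H| < \delta$, where $H = \{T_1, T_2, \ldots, T_w\}$. For each $T_j \in H$ and $R_i \in G$ we know that exactly one of $T_j \cap R_i = \emptyset$, $T_j \subset R_i$ or $T_j \cap \partial(R_i) \neq \emptyset$ is true by Theorem 12.15. Thus,

$U(f,H) - L(f,H) = \sum_{\{T_j \in H \mid T_j \cap S = \emptyset\}} (M_j^H - m_j^H)|T_j| + \sum_{\{T_j \in H \mid T_j \cap S \neq \emptyset\}} (M_j^H - m_j^H)|T_j|.$

For the first sum, we note that for each $T_j$ where $T_j \cap S = \emptyset$, we know that $T_j \subset R_i$ for some $R_i \in G$, and so $M_j^H - m_j^H \le M_i^G - m_i^G$. Since $\sum_{\{T_j \in H \mid T_j \subset R_i\}} |T_j| \le |R_i|$ it follows that

$\sum_{\{T_j \in H \mid T_j \cap S = \emptyset\}} (M_j^H - m_j^H)|T_j| \le \sum_{i=1}^k (M_i^G - m_i^G)|R_i| = U(f,G) - L(f,G) < \frac{\epsilon}{2}.$

For the second sum, we know that $\sum_{\{T_j \in H \mid T_j \cap S \neq \emptyset\}} |T_j| = \mathrm{Vol}\left(\bigcup_{\{T_j \in H \mid T_j \cap S \neq \emptyset\}} T_j\right)$ since the $T_j$ rectangles are non-overlapping. Since each $T_j$ rectangle intersecting $S$ is a subset of some $B_i$ we know that $\mathrm{Vol}\left(\bigcup_{\{T_j \in H \mid T_j \cap S \neq \emptyset\}} T_j\right) \le \mathrm{Vol}\left(\bigcup_{i=1}^m B_i\right) \le \sum_{i=1}^m |B_i| < \frac{\epsilon}{4M}$. From this, we conclude that

$\sum_{\{T_j \in H \mid T_j \cap S \neq \emptyset\}} (M_j^H - m_j^H)|T_j| < 2M\left(\frac{\epsilon}{4M}\right) = \frac{\epsilon}{2}.$

Hence $U(f,H) - L(f,H) < \epsilon$.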
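The behaviour described in Theorem 12.29 can be observed numerically. The following is only a sketch under our own assumptions, not part of the text: we take $E$ to be the closed unit disk, $R = [-1,1]^2$, $f = 1$ on $E$, and uniform $n \times n$ grids. For the zero extension of $f$, the difference $M_i - m_i$ equals $1$ exactly on the grid rectangles that meet both $E$ and its complement, so $U - L$ is the total area of those rectangles. The function name `boundary_cell_area` is hypothetical.

```python
def boundary_cell_area(n):
    """Return U(F,G) - L(F,G) for the uniform n-by-n grid G on R = [-1,1]^2,
    where F is the zero extension of f = 1 on the closed unit disk E."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            x0, x1 = -1.0 + i * h, -1.0 + (i + 1) * h
            y0, y1 = -1.0 + j * h, -1.0 + (j + 1) * h
            # Squared distance from the origin to the nearest point of the cell.
            nx = min(max(0.0, x0), x1)
            ny = min(max(0.0, y0), y1)
            nearest = nx * nx + ny * ny
            # Squared distance from the origin to the farthest corner of the cell.
            farthest = max(x0 * x0, x1 * x1) + max(y0 * y0, y1 * y1)
            # M_i - m_i = 1 exactly when the cell meets both E and R \ E.
            if nearest <= 1.0 and farthest > 1.0:
                total += h * h
    return total


if __name__ == "__main__":
    for n in (10, 20, 40, 80):
        print(n, boundary_cell_area(n))
```

Running the script prints the gap $U - L$ for finer and finer grids; the values should shrink as the mesh shrinks, which is what the theorem guarantees for any integrable function.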


We show that if we take any sequence of grids whose meshes go to zero (grids induced
by standard partitions on the edge factors of a cube, for instance) then a function is integrable
if and only if, regardless of marking, the Riemann sums always converge to the same value
(in which case that value is the integral).


Theorem 12.30. Let $E$ be a Jordan region contained in rectangle $R \subset \mathbb{R}^n$. Let $f : E \to \mathbb{R}$ be bounded. Let $\{G_n\}$ be a sequence of grids on $R$ so that $\{|G_n|\} \to 0$. Let $F$ be the zero extension of $f$ to $R$. Then $f$ is integrable if and only if there is a number $I$ so that $\{S_{T_n}(F,G_n)\} \to I$ regardless of choice of markings $T_n$ of $G_n$, in which case $\int_E f = I$.


Proof. First, assume that $f$ is integrable and $\int_E f = I$. Let $\epsilon > 0$. By Theorem 12.29, we can find $\delta > 0$ so that if $G$ is a grid on $R$ and $|G| < \delta$ then $U(F,G) - L(F,G) < \epsilon$. Choose $k \in \mathbb{N}$ so that if $n > k$ then $|G_n| < \delta$. Then if $n > k$ we know that $U(F,G_n) - L(F,G_n) < \epsilon$. Since, for any marking $T_n$, it is true that $S_{T_n}(F,G_n), I \in [L(F,G_n), U(F,G_n)]$ it follows that $|S_{T_n}(F,G_n) - I| < \epsilon$. Thus, $S_{T_n}(F,G_n) \to I$ regardless of choice of markings $T_n$ of $G_n$.

Next, assume that $S_{T_n}(F,G_n) \to I$ regardless of choice of markings $T_n$ of $G_n$. For each $n \in \mathbb{N}$, by Theorem 12.28, we can choose markings $T_n$, $R_n$ of $G_n$ so that $U(F,G_n) - S_{T_n}(F,G_n) < \frac{1}{n}$ and $S_{R_n}(F,G_n) - L(F,G_n) < \frac{1}{n}$. Since $\{S_{T_n}(F,G_n)\} \to I$ and $\{U(F,G_n) - S_{T_n}(F,G_n)\} \to 0$ we know $\{U(F,G_n)\} \to I$. Since $\{S_{R_n}(F,G_n)\} \to I$ and $\{S_{R_n}(F,G_n) - L(F,G_n)\} \to 0$ we know $\{L(F,G_n)\} \to I$ and $\{U(F,G_n) - L(F,G_n)\} \to 0$. Hence, given any $\epsilon > 0$ we can choose $k$ so that if $n > k$ then $U(F,G_n) - L(F,G_n) < \epsilon$ and $|S_{T_n}(F,G_n) - I| < \epsilon$, so $F$ is integrable on $R$ which means $f$ is integrable on $E$. Since $\int_R F, S_{T_n}(F,G_n) \in [L(F,G_n), U(F,G_n)]$ it follows that $|\int_R F - S_{T_n}(F,G_n)| < \epsilon$ and since $|S_{T_n}(F,G_n) - I| < \epsilon$, we know that $|I - \int_R F| < 2\epsilon$. Since this is true for all $\epsilon > 0$, it follows that $I = \int_R F = \int_E f$.
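Theorem 12.30 says that for an integrable function the choice of marking does not matter in the limit. As a hedged numerical sketch (the region, the function and the names `F` and `riemann_sum` are our own illustrative choices, not taken from the text), take $R = [0,1]^2$, $E = \{(x,y) \in R \mid x + y \le 1\}$ and $f(x,y) = x + y$, for which $\int_E f = \frac{1}{3}$. Each call below picks a fresh random marking of the uniform $n \times n$ grid.

```python
import random


def F(x, y):
    """Zero extension to R = [0,1]^2 of f(x, y) = x + y on E = {x + y <= 1}."""
    return x + y if x + y <= 1.0 else 0.0


def riemann_sum(n, rng):
    """Riemann sum of F over the uniform n-by-n grid with a random marking."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            # One marking point chosen at random inside the cell.
            tx = (i + rng.random()) * h
            ty = (j + rng.random()) * h
            total += F(tx, ty) * h * h
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (10, 50, 200):
        # Two independent markings of the same grid.
        print(n, riemann_sum(n, rng), riemann_sum(n, rng))
```

For this example $U(F,G_n) - L(F,G_n) \le \frac{3}{n}$ (the $n$ cells crossing the line $x + y = 1$ contribute at most $\frac{1}{n}$, the remaining cells at most $\frac{2}{n}$), so every marked sum lies within $\frac{3}{n}$ of $\frac{1}{3}$.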


We next develop a way to determine whether a function is integrable on a Jordan region 
based on the measure of the discontinuities of the function. This takes a few steps. 


Definition 92


Let $f : E \to \mathbb{R}$ be bounded, $E$ a Jordan region. For each set $S$ in $\mathbb{R}^n$ which intersects $E$ we define the oscillation of $f$ on $S$ to be $\Omega_f(S) = \sup_{x \in S \cap E} f(x) - \inf_{x \in S \cap E} f(x)$. If $p \in E$ then we define the oscillation of $f$ at $p$ to be $\omega_f(p) = \inf_{\epsilon > 0} \Omega_f B_\epsilon(p) = \lim_{h \to 0^+} \Omega_f B_h(p) = \inf_{R_\epsilon} \Omega_f(R_\epsilon)$, where $R_\epsilon$ is an $n$-rectangle containing $p$ in its interior whose diameter is $\epsilon$, and $\epsilon > 0$.


We could have defined integrability on arbitrary regions rather than just Jordan regions 
in the manner described in the definition of integrability over a Jordan region, but for 
continuous functions that are non-zero on their boundaries the set of regions for which 
the function would have been integrable would just be exactly the Jordan regions anyway, 
and Jordan regions behave nicely under maps like those described in the Inverse Function 
Theorem (maps that are one to one, continuously differentiable and have a derivative with 
non-zero determinant). 

It sometimes looks neater to write $\Omega_f S$ rather than $\Omega_f(S)$, but both mean the same thing. For instance, on an open interval we will write $\Omega_f(a,b)$ instead of $\Omega_f((a,b))$. We should also prove that $\inf_{\epsilon > 0} \Omega_f B_\epsilon(p) = \lim_{h \to 0^+} \Omega_f B_h(p) = \inf_{R_\epsilon} \Omega_f(R_\epsilon)$ rather than simply claim that these are all the same, so we do this below.


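Theorem 12.31. Let $f : D \to \mathbb{R}$ be bounded and let $I_1, I_2$ be sets intersecting $D$ with $I_1 \subseteq I_2$. Then $\Omega_f(I_1) \le \Omega_f(I_2)$.


Proof. We know $\sup_{x \in I_1 \cap D} f(x) \le \sup_{x \in I_2 \cap D} f(x)$ and $\inf_{x \in I_1 \cap D} f(x) \ge \inf_{x \in I_2 \cap D} f(x)$, which means $\Omega_f(I_1) \le \Omega_f(I_2)$.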




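Theorem 12.32. Let $f : D \to \mathbb{R}$ be bounded and let $p \in D$. Then there is a non-negative number $\omega_f(p) = \inf_{\epsilon > 0} \Omega_f(B_\epsilon(p)) = \lim_{h \to 0^+} \Omega_f(B_h(p)) = \inf_{R_\epsilon} \Omega_f(R_\epsilon)$ where $R_\epsilon$ is any rectangle of diameter $\epsilon$ containing $p$ in its interior and $\epsilon > 0$.

Proof. Note that $\Omega_f(S)$ is a non-negative real number for each set $S$ which intersects $D$, so $\omega = \inf_{\epsilon > 0} \Omega_f(B_\epsilon(p))$ exists and is non-negative. Let $\epsilon > 0$. Then for some $\delta > 0$ we know that $\Omega_f(B_\delta(p)) < \omega + \epsilon$ by the approximation property. However, we also know that if $0 < h < \delta$ then $\Omega_f(B_h(p)) \le \Omega_f(B_\delta(p))$ by Theorem 12.31, so $\omega = \omega_f(p) = \lim_{h \to 0^+} \Omega_f(B_h(p))$.

Similarly, if $R$ has diameter less than $\delta$ and contains $p$ in its interior then $\Omega_f(R) < \omega + \epsilon$ since $R \subset B_\delta(p)$. Likewise, if $R$ is any rectangle containing $p$ in its interior then there is a $\delta'$ so that $B_{\delta'}(p) \subset R$, so $\Omega_f(B_{\delta'}(p)) \le \Omega_f(R)$. Hence, $\inf_{R_\epsilon} \Omega_f(R_\epsilon) = \inf_{\epsilon > 0} \Omega_f B_\epsilon(p) = \omega_f(p)$.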


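Theorem 12.33. Let $f : E \to \mathbb{R}$ be bounded and let $\epsilon > 0$, where $E \subset \mathbb{R}^n$, and let $p \in E^\circ$ where $\omega_f(p) \ge \epsilon$. Then $\Omega_f E \ge \epsilon$.


Proof. Choose $\delta > 0$ so that $B_\delta(p) \subset E$. Thus, by Theorem 12.31 and the definition of oscillation at a point, $\Omega_f(E) \ge \Omega_f(B_\delta(p)) \ge \omega_f(p) \ge \epsilon$.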


Continuity can be characterized in terms of oscillation. We are working towards proving 
that a function on a Jordan region is integrable if and only if its set of discontinuities has 
Lebesgue measure zero. In the next theorem we show that a function is continuous at a point
if and only if its oscillation at that point is zero. 


Theorem 12.34. Let $f : D \to \mathbb{R}$ be bounded and let $p \in D$. Then $f$ is continuous at $p$ if and only if $\omega_f(p) = 0$.


Proof. Assume $f$ is continuous at $p$ and let $\epsilon > 0$. Choose $\delta > 0$ so that if $|x - p| < \delta$ and $x \in D$ then $|f(x) - f(p)| < \frac{\epsilon}{2}$. Then $\sup_{x \in B_\delta(p) \cap D} f(x) \le f(p) + \frac{\epsilon}{2}$ and $\inf_{x \in B_\delta(p) \cap D} f(x) \ge f(p) - \frac{\epsilon}{2}$. Thus $\Omega_f B_\delta(p) \le \epsilon$, so $\omega_f(p) \le \epsilon$ for all $\epsilon > 0$, which means that $\omega_f(p) = 0$.

Assume that $\omega_f(p) = 0$ and let $\epsilon > 0$. Then since $\omega_f(p) = \inf_{h > 0} \Omega_f B_h(p)$, by the Approximation Property we can find $\delta > 0$ so that $\Omega_f B_\delta(p) < \epsilon$, which means that if $|x - p| < \delta$ and $x \in D$ then $|f(x) - f(p)| < \epsilon$, so $f$ is continuous at $p$.
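To make the link between oscillation and continuity concrete, here is a small one-variable sketch. It is an illustration under our own assumptions: `oscillation_on_ball`, `step` and `square` are hypothetical names, and the supremum and infimum are only estimated from finitely many sample points, so the returned value is a lower estimate of $\Omega_f B_h(p)$ rather than its exact value.

```python
def oscillation_on_ball(f, p, h, samples=1001):
    """Estimate Omega_f(B_h(p)) in one variable by sampling f on (p - h, p + h)."""
    xs = [p - h + 2.0 * h * (k + 0.5) / samples for k in range(samples)]
    vals = [f(x) for x in xs]
    return max(vals) - min(vals)


def step(x):
    """A bounded function with a jump discontinuity at 0."""
    return 1.0 if x >= 0.0 else 0.0


def square(x):
    """A function that is continuous everywhere."""
    return x * x


if __name__ == "__main__":
    for h in (1e-1, 1e-2, 1e-3):
        print(h,
              oscillation_on_ball(step, 0.0, h),    # stays at 1: discontinuous at 0
              oscillation_on_ball(step, 0.5, h),    # 0: continuous at 0.5
              oscillation_on_ball(square, 0.0, h))  # shrinks to 0: continuous at 0
```

At the jump the estimated oscillation does not shrink with $h$, so $\omega_f(0) = 1 \neq 0$, while at points of continuity the estimates tend to $0$, in line with Theorem 12.34.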


We have more control over compact set behavior most of the time. While the set of 
discontinuities may not be compact for a function, we can show that the set of points in 
a compact domain where the oscillation is more than or equal to any particular value is 
compact, which is our objective below. Then we can describe the set of discontinuities of a
function as a union of such sets. 




Theorem 12.35. Let $K$ be a closed set in $\mathbb{R}^n$, and $f : K \to \mathbb{R}$ be bounded and $\epsilon > 0$, and let $E = \{x \in K \mid \omega_f(x) \ge \epsilon\}$. Then $E$ is closed. If $K$ is compact then $E$ is compact.


Proof. Let $\{x_n\} \subset E$, where $\{x_n\} \to p$. Since $K$ is closed, $p \in K$. For any $h > 0$ we know that $B_h(p)$ contains $x_m$ for some $m \in \mathbb{N}$. Choose $\delta > 0$ so that $B_\delta(x_m) \subset B_h(p)$. Since $\epsilon \le \omega_f(x_m) \le \Omega_f B_\delta(x_m) \le \Omega_f B_h(p)$ by Theorems 12.32 and 12.31, it follows that $\omega_f(p) \ge \epsilon$. Hence, $E$ contains all of its limit points and is closed. If $K$ is compact then $E$ is also bounded and thus compact by the Heine-Borel Theorem.


We next show that if the oscillation at a point is small then there is a radius so that
the oscillation is also small on the open ball of that radius about the point, and therefore
on any rectangle containing the point whose diameter is less than that radius. This is addressed below.


Theorem 12.36. Let $f : D \to \mathbb{R}$ be bounded and let $p \in D$ and $\epsilon > 0$. If $\omega_f(p) < \epsilon$ then there is a $\delta > 0$ so that $\Omega_f B_\delta(p) < \epsilon$.


Proof. This follows directly from the Approximation Property since we know that $\omega_f(p) = \inf_{\{h \in \mathbb{R} \mid h > 0\}} \Omega_f B_h(p)$.


The following theorem tells us that in a compact set if the oscillation is small at points 
of the set then the oscillation on small enough rectangles intersecting the set can also be 
made small. 


Theorem 12.37. Let $f : D \to \mathbb{R}$ be a bounded function, with $K$ a compact subset of $D$. Let $\epsilon > 0$ and $\omega_f(p) < \epsilon$ for each $p \in K$. Then there is a $\gamma > 0$ so that if $I$ is a set with diameter less than $\gamma$ and $I \cap K \neq \emptyset$ then $\Omega_f I < \epsilon$.


Proof. By Theorem 12.36 for each $p \in K$ we can find an $\epsilon_p > 0$ so that $\Omega_f(B_{\epsilon_p}(p)) < \epsilon$. Then $C = \{B_{\epsilon_p}(p)\}_{p \in K}$ is an open cover of $K$, so by the Lebesgue Number Lemma we can find $\gamma > 0$ so that if $I$ is a set so that $I \cap K \neq \emptyset$ and $\mathrm{diam}(I) < \gamma$ then $I \subset B_{\epsilon_p}(p)$ for some $p \in K$, which means that $\Omega_f(I) < \epsilon$.


Theorem 12.38. Let $E$ be a set in $\mathbb{R}^n$ so that for any countable collection of rectangles $\{R_i\}_{i \in \mathbb{N}}$ covering $E$, $\sum_{i=1}^\infty |R_i| \ge \gamma$. Then if $\{I_i\}_{i \in \mathbb{N}}$ is a collection of rectangles that covers $E$, it is also true that $\sum_{\{i \mid I_i^\circ \cap E \neq \emptyset\}} |I_i| \ge \gamma$.


Proof. Suppose that $\sum_{\{i \mid I_i^\circ \cap E \neq \emptyset\}} |I_i| = \alpha < \gamma$ for a countable collection of rectangles $\{I_i\}_{i \in \mathbb{N}}$ covering $E$. Then all of the points of $E$ not covered by $C = \{I_i \mid I_i^\circ \cap E \neq \emptyset\}$ are contained in the boundaries of the $I_i$ rectangles, each of which has Jordan content zero and therefore Lebesgue measure zero. Thus, if we set $B = \bigcup_{i=1}^\infty \partial(I_i)$ then $\lambda(B) = 0$ and so we can find a collection of rectangles $D = \{R_i\}_{i \in \mathbb{N}}$ which covers $B$ so that $\sum_{i=1}^\infty |R_i| < \gamma - \alpha$. Hence, the set of all rectangles in $C \cup D$ is a countable collection of rectangles which cover $E$, the sum of whose volumes is less than $\gamma$, a contradiction.


The following theorem is the Lebesgue Characterization of Riemann Integrability in 
Euclidean spaces, though that is not the most common name for it. This condition 
frequently makes it easier to determine whether a function is integrable. A function on 
a Jordan region is integrable if and only if the set of points at which the function is not 
continuous is a set of Lebesgue measure zero. 


Theorem 12.39. Lebesgue Characterization of Riemann Integrability in $\mathbb{R}^n$. Let $f_E : E \to \mathbb{R}$ be bounded, where $E$ is a Jordan region in $\mathbb{R}^n$ contained in the $n$-rectangle $R$, and let $f$ be the zero extension of $f_E$ to $R$. Then $f_E$ is integrable on $E$ (or equivalently, $f$ is integrable on $R$) if and only if the set $D = \{x \in R \mid f \text{ is not continuous at } x\}$ has Lebesgue measure zero, which is true if and only if $W = \{x \in E^\circ \mid f \text{ is not continuous at } x\} = \{x \in E^\circ \mid f_E \text{ is not continuous at } x\}$ has Lebesgue measure zero.


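Proof. Since $f$ is bounded we can choose $M > 0$ so that $|f(x)| < M$ for all $x \in R$. For each $n \in \mathbb{N}$ let $D_n = \{x \in R \mid \omega_f(x) \ge \frac{1}{n}\}$. Note that $D = \bigcup_{n=1}^\infty D_n$. If $\lambda(D) = 0$ then $\lambda(D_n) = 0$ for each $n \in \mathbb{N}$ by Theorem 12.11.

Assume that $\lambda(D) \neq 0$. Then for some $m \in \mathbb{N}$ we know from Theorem 12.13 that $\lambda(D_m) \neq 0$, so there is a number $\gamma > 0$ so that if $\{R_i\}_{i \in \mathbb{N}}$ is a cover of $D_m$ by rectangles then $\sum_{i=1}^\infty |R_i| \ge \gamma$. Let $G$ be a grid on $R$. Then

$U(f,G) - L(f,G) = \sum_{\{R_i \in G \mid R_i^\circ \cap D_m \neq \emptyset\}} (M_i - m_i)|R_i| + \sum_{\{R_i \in G \mid R_i^\circ \cap D_m = \emptyset\}} (M_i - m_i)|R_i|.$

By Theorem 12.33 we know $M_i - m_i \ge \frac{1}{m}$ if $R_i^\circ \cap D_m \neq \emptyset$, so we know that $\sum_{\{R_i \in G \mid R_i^\circ \cap D_m \neq \emptyset\}} (M_i - m_i)|R_i| \ge \frac{\gamma}{m}$ by Theorem 12.38, and thus $f$ is not integrable.

Assume that $\lambda(D) = 0$. Let $\epsilon > 0$. Choose $j \in \mathbb{N}$ so that $\frac{|R|}{j} < \frac{\epsilon}{2}$. Choose a countable cover of $D_j$ by interiors of rectangles $C = \{I_i^\circ\}_{i \in \mathbb{N}}$ so that $\sum_{i=1}^\infty |I_i| < \frac{\epsilon}{4M}$. Let $K = R \setminus \bigcup C$. We know $K$ is compact by the Heine-Borel Theorem, and $\omega_f(p) < \frac{1}{j}$ for all $p \in K$, so by Theorem 12.37 we can find a number $\delta > 0$ so that if $I$ is a rectangle intersecting $K$ with diameter less than $\delta$ then $\Omega_f(I) < \frac{1}{j}$.

Let $G = \{R_i\}_{1 \le i \le s}$ be a grid on $R$ with $|G| < \delta$. Then

$U(f,G) - L(f,G) = \sum_{\{R_i \in G \mid R_i \cap K = \emptyset\}} (M_i - m_i)|R_i| + \sum_{\{R_i \in G \mid R_i \cap K \neq \emptyset\}} (M_i - m_i)|R_i|.$

Since the mesh of $G$ is less than $\delta$ we have $\sum_{\{R_i \in G \mid R_i \cap K \neq \emptyset\}} (M_i - m_i)|R_i| < \frac{|R|}{j} < \frac{\epsilon}{2}$. Likewise, by Theorem 12.26, we know $\sum_{\{R_i \in G \mid R_i \cap K = \emptyset\}} |R_i| \le \frac{\epsilon}{4M}$, so it follows that $\sum_{\{R_i \in G \mid R_i \cap K = \emptyset\}} (M_i - m_i)|R_i| \le 2M \cdot \frac{\epsilon}{4M} = \frac{\epsilon}{2}$. Hence, $U(f,G) - L(f,G) < \epsilon$ and $f$ is integrable.

Finally, if $p \in R \setminus \overline{E}$ then there is a $\delta > 0$ so $B_\delta(p) \cap E = \emptyset$ and therefore $f(x) = f(p) = 0$ if $x \in B_\delta(p) \cap R$, which means that $f$ is continuous on $R \setminus \overline{E}$, so $D \subseteq (W \cup \partial(E))$. Since $E$ is bounded we know that $\partial(E)$ is closed and bounded and therefore compact, which means that $\mathrm{Vol}(\partial(E)) = 0$ if and only if $\lambda(\partial(E)) = 0$ by Theorem 12.10. Since $E$ is a Jordan region, we know that $\lambda(\partial(E)) = \mathrm{Vol}(\partial(E)) = 0$. Thus, if $\lambda(W) = 0$ then $\lambda(W \cup \partial(E)) = 0$, so $\lambda(D) = 0$ and $f$ is integrable. If $\lambda(W) \neq 0$ then $\lambda(D) \neq 0$, so $f$ is not integrable.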


The following theorem helps to motivate the reason for choosing Jordan regions as the 
regions over which we will consider integrals. Specifically, a constant function can only be 
integrated over a region which is a Jordan region. 


Theorem 12.40. Let $E \subset R$, a rectangle in $\mathbb{R}^n$. Then $\chi_E$ is integrable on $R$ if and only if $E$ is a Jordan region.


Proof. Since $E$ is bounded, $\partial(E)$ is closed and bounded and therefore compact by the Heine-Borel Theorem. Hence, $\mathrm{Vol}(\partial(E)) = 0$ if and only if $\lambda(\partial(E)) = 0$ by Theorem 12.10. Let $p \in E^\circ$. Then there is some $\delta > 0$ so that $B_\delta(p) \subset E$, so $\chi_E(x) = 1 = \chi_E(p)$ and thus $|\chi_E(x) - \chi_E(p)| = 0$ if $|x - p| < \delta$. Hence, $\chi_E$ is continuous on $E^\circ$.

Likewise, if $p \in R \setminus \overline{E}$ then there is a $\delta > 0$ so $B_\delta(p) \cap E = \emptyset$ and therefore $\chi_E(x) = \chi_E(p) = 0$ if $x \in B_\delta(p) \cap R$, which means that $\chi_E$ is continuous on $R \setminus \overline{E}$.

However, if $p \in \partial(E) \cap R^\circ$ then for every $\delta > 0$, $B_\delta(p)$ contains a point $x_1 \in E$ and a point $x_2 \in R \setminus E$, which means that $|\chi_E(x_1) - \chi_E(x_2)| = 1$ and therefore $\chi_E$ is not continuous at $p$. Thus, if $E \subset R^\circ$ then the set of discontinuities of $\chi_E$ is the boundary of $E$, and whether or not $E \subset R^\circ$, we know that all points of $\partial(E)$ which are not contained in $R^\circ$ are contained in $\partial(R)$, and all discontinuities of $\chi_E$ are contained in $\partial(E)$. Hence, if $E$ is a Jordan region then $\lambda(\partial(E)) = 0$, so the measure of the set of discontinuities of $\chi_E$ is zero, so $\chi_E$ is integrable by Theorem 12.39. If $E$ is not a Jordan region then $\lambda(\partial(E)) \neq 0$. Since $\lambda(\partial(R)) = 0$, we know that $\lambda(\partial(E) \setminus \partial(R)) \neq 0$ since the union of measure zero sets has measure zero. Since all points of $\partial(E) \setminus \partial(R)$ are points of discontinuity of $\chi_E$ we know that the set of discontinuities of $\chi_E$ does not have Lebesgue measure zero, which means that $\chi_E$ is not integrable.


Theorem 12.41. Let $f$ be a continuous function which is bounded on $g(E)$ and let $g : E \to \mathbb{R}$ be integrable, where $E$ is a Jordan region in $\mathbb{R}^n$. Then $f \circ g$ is integrable on $E$.


Proof. If $D_g$ is the set of discontinuities of $g$ and $D_{f \circ g}$ is the set of discontinuities of $f \circ g$ then $D_{f \circ g} \subseteq D_g$ by Theorem 10.16. Since $\lambda(D_g) = 0$ we know that $\lambda(D_{f \circ g}) = 0$, so by the Lebesgue Characterization of Riemann Integrability it follows that $f \circ g$ is integrable on $E$.


Theorem 12.42. Let $f, g : E \to \mathbb{R}$ be integrable, where $E$ is a Jordan region in $\mathbb{R}^n$. Then $fg$ is integrable.


Proof. Let $D_f$ be the set of discontinuities of $f$ and let $D_g$ be the set of discontinuities of $g$ and let $D_{fg}$ be the set of discontinuities of $fg$. Then by Theorem 10.20, we know that $D_{fg} \subseteq (D_f \cup D_g)$. By the Lebesgue Characterization of Riemann Integrability, $\lambda(D_g) = \lambda(D_f) = 0$. Since the union of two sets of measure zero has measure zero and any subset of a set of measure zero has measure zero, we know that $\lambda(D_{fg}) = 0$, so $fg$ is integrable.


Theorem 12.43. Let $f : E \to \mathbb{R}$ be a bounded function, where $E$ is a Jordan region in $\mathbb{R}^n$ and $R$ is a rectangle in $\mathbb{R}^n$ containing $E$. Let $g : R \to \mathbb{R}$ be a bounded function so that $f(x) = g(x)$ for all $x \in E^\circ$ and $g(x) = 0$ if $x \in R \setminus \overline{E}$. Then $f$ is integrable on $E$ if and only if $g$ is integrable on $R$, in which case $\int_E f = \int_R g$. In particular, if $H$ is the zero boundary extension of $f$ to $R$, then $f$ is integrable on $E$ if and only if $H$ is integrable on $R$, in which case $\int_E f = \int_R H$.

Proof. Let $F$ be the zero extension of $f$ to $R$. Recall that $E \setminus E^\circ \subseteq \overline{E} \setminus E^\circ = \partial(E)$, and $E^\circ \cup (R \setminus \overline{E}) = R \setminus \partial(E)$, which means that $F(x) = g(x)$ for all $x \in R \setminus \partial(E)$. Since $\partial(E)$ is closed, if $x \in R \setminus \partial(E)$ then there is a $\delta_x > 0$ so that $B_{\delta_x}(x) \cap \partial(E) = \emptyset$.

If $F$ is continuous at $p \in R \setminus \partial(E)$ then for every $\epsilon > 0$ there is a $\delta > 0$ so that if $|x - p| < \delta$ and $x \in R$ then $|F(x) - F(p)| < \epsilon$. Hence, if $|x - p| < \min\{\delta, \delta_p\}$ and $x \in R$ then $|F(x) - F(p)| = |g(x) - g(p)| < \epsilon$, so $g$ is continuous at $p$. Likewise, if $g$ is continuous at $p$ then $F$ is continuous at $p$.

Let $D_F$ be the set of discontinuities of $F$ and let $D_g$ be the set of points at which $g$ is discontinuous. Then $D_g \subseteq D_F \cup \partial(E)$. Since we know that $\mathrm{Vol}(\partial(E)) = 0$ it follows that $\lambda(\partial(E)) = 0$. If $f$ is integrable on $E$ then $F$ is integrable on $R$ and $\lambda(D_F) = 0$ by the Lebesgue Characterization of Riemann Integrability, which means that $\lambda(D_F \cup \partial(E)) = 0$, and so $\lambda(D_g) = 0$, which implies that $g$ is integrable on $R$ by the Lebesgue Characterization of Riemann Integrability. Likewise, $D_F \subseteq D_g \cup \partial(E)$, so if $g$ is integrable on $R$ then $F$ is integrable on $R$ and hence $f$ is integrable on $E$.

Let $\{G_k\}$ be a sequence of grids on $R$ so that $\{|G_k|\} \to 0$. For each grid $G_k$ we can choose a marking $T_k$ of $G_k$ so that $T_k \cap \partial(E) = \emptyset$. This is because $E$ is a Jordan region, which means that $\mathrm{Vol}(\partial(E)) = 0$, which means that $\partial(E)$ cannot contain an $n$-rectangle. For each $R_j \in G_k$ it must follow that $R_j \not\subseteq \partial(E)$ so we can choose a point $t_j \in R_j \setminus \partial(E)$ and set $T_k$ to be the set of points $t_j$ thus chosen. Since $F(x) = g(x)$ for all $x \in R \setminus \partial(E)$, we know that $\{S_{T_k}(F,G_k)\} = \{S_{T_k}(g,G_k)\}$. We know $\{S_{T_k}(F,G_k)\} \to \int_R F = \int_E f$ and $\{S_{T_k}(g,G_k)\} \to \int_R g$, which means $\int_E f = \int_R g$.

Finally, the zero boundary extension of $f$ to $R$ is just a special case of a function $g$ as described, so the theorem follows.




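Theorem 12.44. Let $f, g$ be integrable on a Jordan region $E$ contained in a rectangle $R$ in $\mathbb{R}^n$. Then:

(a) Let $\alpha, \beta \in \mathbb{R}$. Then $\int_E \alpha f + \beta g = \alpha \int_E f + \beta \int_E g$.

(b) If $f(x) \le g(x)$ for all $x \in E^\circ$ then $\int_E f \le \int_E g$.

(c) Let $m \in \mathbb{R}$. Then $\int_E m = m(\mathrm{Vol}(E))$.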


Proof. (a) Let $F, G$ be the zero boundary extensions of $f, g$ to $R$. Then $\int_R F = \int_E f$ and $\int_R G = \int_E g$ by Theorem 12.43. Let $\{G_n\}$ be a sequence of grids whose meshes approach zero. Then by Theorem 12.30, we know that for any markings $T_n$ of $G_n$ it is true that $\{S_{T_n}(F,G_n)\} \to \int_R F$ and $\{S_{T_n}(G,G_n)\} \to \int_R G$. Hence $\{S_{T_n}(\alpha F + \beta G, G_n)\} \to \alpha \int_R F + \beta \int_R G$, so, again, by Theorem 12.30, we know that $\int_E \alpha f + \beta g = \alpha \int_E f + \beta \int_E g$.

(b) By the Comparison Theorem for sequences we know that since $S_{T_n}(F,G_n) \le S_{T_n}(G,G_n)$ for each $n \in \mathbb{N}$, it follows that $\int_E f \le \int_E g$.

(c) Let $g(x) = m$ if $x \in E$ and $g(x) = 0$ if $x \in R \setminus E$. Then by Theorem 12.43, we know that $\int_R g = \int_E m$. Let $G = \{R_i\}_{1 \le i \le p}$ be any grid on $R$. Then $U(g,G) = \sum_{R_i \in O(E,G)} m|R_i| = mV(E,G)$. Taking the infimum of both sides over all grids $G$ on $R$ we get $m\mathrm{Vol}(E) = \int_R g = \int_E m$.


Theorem 12.45. Let $E_1, E_2$ be non-overlapping Jordan regions in $\mathbb{R}^n$ and let $\int_{E_1} f$, $\int_{E_2} f$ exist. Then $\int_{E_1 \cup E_2} f = \int_{E_1} f + \int_{E_2} f$.


Proof. First, we know that $f$ is integrable on $E_1 \cup E_2$ because by Theorem 12.39 we know that the set $D_1$ of discontinuities of $f$ on $E_1$ and the set $D_2$ of discontinuities of $f$ on $E_2$ have Lebesgue measure zero. Thus, the set of discontinuities of $f$ on $E_1 \cup E_2$ is a subset of $D_1 \cup D_2 \cup \partial(E_1) \cup \partial(E_2)$. Since each of these sets has Lebesgue measure zero, we know that the set of discontinuities of $f$ has Lebesgue measure zero, so $\int_{E_1 \cup E_2} f$ exists.

Let $R$ be a rectangle containing $E_1 \cup E_2$. Let $F, F_1, F_2$ represent the zero boundary extensions of $f$ to $R$ considering the domain of $f$ to be $E_1 \cup E_2$, $E_1$ and $E_2$ respectively.

Since $E_1$ and $E_2$ are non-overlapping it follows that $E_1^\circ \cap E_2 = E_2^\circ \cap E_1 = \emptyset$. Let $\{G_n\}$ be a sequence of grids on $R$ so that $\{|G_n|\} \to 0$. Since $\partial(E_1) \cup \partial(E_2)$ has volume zero, this set contains no $n$-rectangles, so we can choose markings $T_n$ for each $G_n$ so that $T_n \cap (\partial(E_1) \cup \partial(E_2)) = \emptyset$.

For any $t_i \in T_n$, if $t_i \in E_1^\circ$ then $F_1(t_i) = f(t_i) = F(t_i)$ and $F_2(t_i) = 0$, and if $t_i \in E_2^\circ$ then $F_1(t_i) = 0$ and $F_2(t_i) = f(t_i) = F(t_i)$. If $t_i \in R$ and $t_i \notin E_1^\circ \cup E_2^\circ$ then $F_1(t_i) = 0 = F_2(t_i)$, and since $t_i \notin (\partial(E_1) \cup \partial(E_2))$ it follows that $t_i \notin (E_1 \cup E_2)^\circ$, so $F(t_i) = 0$. Thus, $S_{T_n}(F_1,G_n) + S_{T_n}(F_2,G_n) = S_{T_n}(F,G_n)$ since $F(t_i) = F_1(t_i) + F_2(t_i)$ for each $t_i \in T_n$.

Since $\{S_{T_n}(F_1,G_n)\} \to \int_{E_1} f$, $\{S_{T_n}(F_2,G_n)\} \to \int_{E_2} f$ and $\{S_{T_n}(F,G_n)\} \to \int_{E_1 \cup E_2} f$, it follows that $\int_{E_1 \cup E_2} f = \int_{E_1} f + \int_{E_2} f$.


Theorem 12.46. Let $E$ be a Jordan region in $\mathbb{R}^n$ and let $\epsilon > 0$. Then there is a $\delta > 0$ so that if $G$ is a grid on a rectangle containing $E$ and $|G| < \delta$ then $V(E,G) - v(E,G) < \epsilon$.


Proof. Let $f(x) = 1$ on $E$. By Theorem 12.29, we can find a $\delta > 0$ so that if $|G| < \delta$ then $V(E,G) - v(E,G) = U(f,G) - L(f,G) < \epsilon$.


Definition 93


Let $R$ be a rectangle in $\mathbb{R}^n$ containing a Jordan region $E$ and let $G$ be a grid on $R$. We define the upper inner sum of $f$ with respect to grid $G$ to be $U(f,G)^\circ = \sum_{R_i \in I(E,G)} M_i|R_i|$ and the lower inner sum of $f$ with respect to grid $G$ to be $L(f,G)^\circ = \sum_{R_i \in I(E,G)} m_i|R_i|$. We define the upper outer sum of $f$ with respect to grid $G$ to be $\overline{U}(f,G) = \sum_{R_i \in O(E,G)} M_i|R_i|$ and the lower outer sum of $f$ with respect to grid $G$ to be $\overline{L}(f,G) = \sum_{R_i \in O(E,G)} m_i|R_i|$.


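Theorem 12.47. Let $f : E \to \mathbb{R}$ be bounded, where $E$ is a Jordan region in $\mathbb{R}^n$, and let $\epsilon > 0$. Then there is a $\delta > 0$ so that if $G$ is a grid on a rectangle $R$ containing $E$ with $|G| < \delta$ then $U(f,G) - (U)\int_E f < \epsilon$ and $(L)\int_E f - L(f,G) < \epsilon$, and also $|U(f,G)^\circ - (U)\int_E f| < \epsilon$ and $|L(f,G)^\circ - (L)\int_E f| < \epsilon$.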

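Proof. Let $W = \{W_i\}_{1 \le i \le m}$ be a grid on $R$ so that $U(f,W) - (U)\int_E f < \frac{\epsilon}{3}$ and $(L)\int_E f - L(f,W) < \frac{\epsilon}{3}$. Choose $M$ so that $|f(x)| < M$ for all $x \in E$. Let $B = \bigcup_{i=1}^m \partial(W_i)$. Then $\mathrm{Vol}(B \cup \partial(E)) = 0$.

By Theorem 12.29 we know that there is a $\delta > 0$ so that if $G = \{R_i\}_{1 \le i \le k}$ is a grid on $R$ with $|G| < \delta$ then $-\frac{\epsilon}{3} < \sum_{R_i \in S(B \cup \partial(E),G)} -M|R_i| < \sum_{R_i \in S(B \cup \partial(E),G)} M|R_i| < \frac{\epsilon}{3}$ (using functions $M$ and $-M$ integrated over $B \cup \partial(E)$).

We know that $U(f,G) \ge (U)\int_E f$ and $L(f,G) \le (L)\int_E f$ by definition. We also know that $|U(f,G) - U(f,G)^\circ| = \left|\sum_{R_i \in S(E,G) \setminus I(E,G)} M_i|R_i|\right| \le \sum_{R_i \in S(\partial(E),G)} M|R_i| < \frac{\epsilon}{3}$ and that $|L(f,G) - L(f,G)^\circ| = \left|\sum_{R_i \in S(E,G) \setminus I(E,G)} m_i|R_i|\right| \le \sum_{R_i \in S(\partial(E),G)} M|R_i| < \frac{\epsilon}{3}$.

We also know that if $R_i \subseteq W_j^\circ$ then $M_i^G \le M_j^W$ and $m_i^G \ge m_j^W$. Thus, $\sum_{R_i \subseteq W_j^\circ} M_i^G|R_i| \le M_j^W|W_j|$ and $\sum_{R_i \subseteq W_j^\circ} m_i^G|R_i| \ge m_j^W \sum_{R_i \subseteq W_j^\circ} |R_i|$ for each $1 \le j \le m$.

From this we conclude that $U(f,G) = \sum_{j=1}^m \sum_{R_i \subseteq W_j^\circ} M_i^G|R_i| + \sum_{R_i \in S(B,G)} M_i^G|R_i| \le \sum_{j=1}^m M_j^W|W_j| + \sum_{R_i \in S(B,G)} M|R_i| < U(f,W) + \frac{\epsilon}{3} < (U)\int_E f + \frac{2\epsilon}{3}$. Hence, we see that $U(f,G) - (U)\int_E f < \epsilon$ and $|U(f,G)^\circ - (U)\int_E f| < \frac{\epsilon}{3} + \frac{2\epsilon}{3} = \epsilon$.

Similarly, $L(f,G) = \sum_{j=1}^m \sum_{R_i \subseteq W_j^\circ} m_i^G|R_i| + \sum_{R_i \in S(B,G)} m_i^G|R_i|$. Now, $L(f,W) = \sum_{j=1}^m m_j^W|W_j| = \sum_{j=1}^m \sum_{R_i \subseteq W_j^\circ} m_j^W|R_i| + \sum_{j=1}^m \sum_{R_i \in S(B,G)} m_j^W|R_i \cap W_j|$. We know that $\left|\sum_{j=1}^m \sum_{R_i \in S(B,G)} m_j^W|R_i \cap W_j|\right| \le \sum_{j=1}^m \sum_{R_i \in S(B,G)} M|R_i \cap W_j| = \sum_{R_i \in S(B,G)} M|R_i| < \frac{\epsilon}{3}$. Since $\sum_{j=1}^m \sum_{R_i \subseteq W_j^\circ} m_i^G|R_i| \ge \sum_{j=1}^m \sum_{R_i \subseteq W_j^\circ} m_j^W|R_i|$, we conclude that $L(f,G) > L(f,W) - \frac{2\epsilon}{3}$, so $(L)\int_E f - L(f,G) < \epsilon$ and $|L(f,G)^\circ - (L)\int_E f| < \epsilon$.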


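Theorem 12.48. Let $E$ be a Jordan region in $\mathbb{R}^n$ and let $G = \{R_i\}_{1 \le i \le k}$ be a grid on a rectangle containing $E$ and let $f$ be a bounded function on $E$ with $|f(x)| < M$ for all $x \in E$. Writing $\overline{U}(f,G)$ and $\overline{L}(f,G)$ for the upper and lower outer sums, we have $|\overline{U}(f,G) - U(f,G)^\circ| \le MV(\partial(E),G)$, $|\overline{U}(f,G) - U(f,G)| \le MV(\partial(E),G)$, $|\overline{L}(f,G) - L(f,G)| \le MV(\partial(E),G)$ and $|\overline{L}(f,G) - L(f,G)^\circ| \le MV(\partial(E),G)$.

Proof. By definition $|\overline{U}(f,G) - U(f,G)^\circ| = \left|\sum_{R_i \in O(E,G)} M_i|R_i| - \sum_{R_i \in I(E,G)} M_i|R_i|\right| = \left|\sum_{R_i \in S(\partial(E),G)} M_i|R_i|\right| \le MV(\partial(E),G)$. Likewise, since $O(E,G) \setminus S(E,G) \subseteq O(E,G) \setminus I(E,G)$ we know $|\overline{U}(f,G) - U(f,G)| = \left|\sum_{R_i \in O(E,G)} M_i|R_i| - \sum_{R_i \in S(E,G)} M_i|R_i|\right| \le MV(\partial(E),G)$.

Similarly, $|\overline{L}(f,G) - L(f,G)^\circ| = \left|\sum_{R_i \in O(E,G)} m_i|R_i| - \sum_{R_i \in I(E,G)} m_i|R_i|\right| = \left|\sum_{R_i \in S(\partial(E),G)} m_i|R_i|\right| \le MV(\partial(E),G)$ and $|\overline{L}(f,G) - L(f,G)| = \left|\sum_{R_i \in O(E,G)} m_i|R_i| - \sum_{R_i \in S(E,G)} m_i|R_i|\right| \le MV(\partial(E),G)$.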




Theorem 12.49. Let $E \subset \mathbb{R}^n$ be a Jordan region contained in a rectangle $R$ and let $f : R \to \mathbb{R}$ be bounded. Then the following are equivalent:

(a) $f$ is integrable on $E$.

(b) For every $\epsilon > 0$ there is a $\delta > 0$ so that if $G$ is a grid on a rectangle containing $E$ with $|G| < \delta$ then all upper and lower sums, upper and lower outer sums and upper and lower inner sums are within a distance $\epsilon$ of each other.

(c) For every $\epsilon > 0$ there is a $\delta > 0$ so that if $G$ is a grid on a rectangle containing $E$ with $|G| < \delta$ then one of the upper sums listed (upper, upper inner or upper outer) is within a distance $\epsilon$ of one of the lower sums listed (lower, lower inner or lower outer).

Proof. Let $\epsilon > 0$. Choose $M > 0$ so that $|f(x)| < M$ on $R$. Since $E$ is a Jordan region we can find a $\delta_1 > 0$ so that if $|G| < \delta_1$ then $V(\partial(E),G) < \frac{\epsilon}{4M}$.

(a) implies (b). By Theorem 12.47 we can find $0 < \delta < \delta_1$ so that if $|G| < \delta$ then $|U(f,G) - \int_E f| < \frac{\epsilon}{4}$ and $|U(f,G)^\circ - \int_E f| < \frac{\epsilon}{4}$ and $|L(f,G) - \int_E f| < \frac{\epsilon}{4}$ and $|L(f,G)^\circ - \int_E f| < \frac{\epsilon}{4}$. By Theorem 12.48, we know that the upper outer sum is within $M \cdot \frac{\epsilon}{4M} = \frac{\epsilon}{4}$ of $U(f,G)^\circ$, and likewise the lower outer sum is within $\frac{\epsilon}{4}$ of $L(f,G)^\circ$. The result follows from the triangle inequality.

(b) implies (c). This is immediate.

(c) implies (a). By Theorem 12.47, we know that we can find $\delta < \delta_1$ so that the upper sums, upper outer sums, and upper inner sums over a grid $G$ with $|G| < \delta$ are within distance $\frac{\epsilon}{4}$ of one another, and the lower inner sums, lower outer sums and lower sums are also within a distance $\frac{\epsilon}{4}$ of one another. Hence, if any of these upper sums can be made within distance $\frac{\epsilon}{4}$ of any of the lower sums then by the triangle inequality it follows that $U(f,G) - L(f,G) < \epsilon$ so $f$ is integrable.


Iterated Integrals 


It is frequently possible to express integrals over a Jordan region or a rectangle in $\mathbb{R}^n$
as an iterated integral, where one integrates integrals with respect to previous variables,
applying the Fundamental Theorem of Calculus each time in order to end up with a neat
way to evaluate an integral.


Definition 94


A set $E \subset \mathbb{R}^2$ is a type one region if $E = \{(x,y) \in \mathbb{R}^2 \mid a \le x \le b \text{ and } g_1(x) \le y \le g_2(x)\}$, where $g_1, g_2$ are continuous functions of $x$ on $[a,b]$. We say $E \subset \mathbb{R}^2$ is a type two region if $E = \{(x,y) \in \mathbb{R}^2 \mid c \le y \le d \text{ and } g_1(y) \le x \le g_2(y)\}$, where $g_1, g_2$ are continuous functions of $y$ on $[c,d]$. The functions $g_1$ and $g_2$ will be referred to as boundary functions for the region in both definitions.

Let $D$ be a region in the plane that is a Jordan region. A set $E \subset \mathbb{R}^3$ is a type one region if $E = \{(x,y,z) \mid (x,y) \in D \text{ and } g_1(x,y) \le z \le g_2(x,y)\}$, where $g_1, g_2$ are continuous functions on $D$, a set in the plane. We say $E$ is a type two region if $E = \{(x,y,z) \mid (y,z) \in D \text{ and } g_1(y,z) \le x \le g_2(y,z)\}$, where $g_1, g_2$ are continuous functions on $D$. We say $E$ is a type three region if $E = \{(x,y,z) \mid (x,z) \in D \text{ and } g_1(x,z) \le y \le g_2(x,z)\}$, where $g_1, g_2$ are continuous functions on $D$ referred to as boundary functions for these regions.

More generally, we can define a projectable region in $\mathbb{R}^n$ to be a region $E = \{(x,y) \in \mathbb{R}^n \mid x \in D \text{ and } g_1(x) \le y \le g_2(x)\}$, where $D$ is a closed Jordan region in $\mathbb{R}^{n-1}$ and $g_1 \le g_2$ and $g_1, g_2$ are continuous on $D$. We can inductively define $E$ to be fully projectable if $D$ is fully projectable.
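As an illustration of these definitions (the region and the integrand are our own choice and do not come from the text), consider the planar region between the parabola $y = x^2$ and the line $y = x$. It is a type one region with boundary functions $g_1(x) = x^2$ and $g_2(x) = x$ on $[0,1]$, and also a type two region with boundary functions $y$ and $\sqrt{y}$ on $[0,1]$. Anticipating the iterated integration justified by the theorems that follow, the integral of $xy$ over this region can be computed in either order:

```latex
\[
E = \{(x,y) \in \mathbb{R}^2 \mid 0 \le x \le 1 \text{ and } x^2 \le y \le x\}
  = \{(x,y) \in \mathbb{R}^2 \mid 0 \le y \le 1 \text{ and } y \le x \le \sqrt{y}\}
\]
\[
\int_0^1 \left( \int_{x^2}^{x} xy \, dy \right) dx
  = \int_0^1 \frac{x^3 - x^5}{2} \, dx
  = \frac{1}{2}\left(\frac{1}{4} - \frac{1}{6}\right)
  = \frac{1}{24}
\]
\[
\int_0^1 \left( \int_{y}^{\sqrt{y}} xy \, dx \right) dy
  = \int_0^1 \frac{y^2 - y^3}{2} \, dy
  = \frac{1}{2}\left(\frac{1}{3} - \frac{1}{4}\right)
  = \frac{1}{24}
\]
```

Both descriptions of the region give the same value, as one would expect when the same Jordan region is written as a type one and as a type two region.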


The theorems that follow justify when we can use iterated integration, which is a process 
we will then describe. 


Theorem 12.50. Let $D$ be a closed Jordan region in $\mathbb{R}^n$, and let $f, g : D \to \mathbb{R}$ be continuous functions so that $f(x) \le g(x)$ on $D$. Let $E$ be the projectable region $\{(x, x_{n+1}) \in \mathbb{R}^{n+1} \mid x \in D \text{ and } f(x) \le x_{n+1} \le g(x)\}$ in $\mathbb{R}^{n+1}$. Let $G_1 = \{(x, f(x)) \in \mathbb{R}^{n+1} \mid x \in D\}$ and $G_2 = \{(x, g(x)) \in \mathbb{R}^{n+1} \mid x \in D\}$ and $W = \{(x, x_{n+1}) \mid x \in \partial(D) \text{ and } f(x) \le x_{n+1} \le g(x)\}$. Then $E$ is a compact Jordan region and $\partial(E) = G_1 \cup G_2 \cup W$.


Proof. Let $p \in \partial(E)$ and let $q = (p_1, p_2, p_3, \ldots, p_n)$ so $p = (q, p_{n+1})$. If $q \in D^\circ$ then there is an open ball $B_{\epsilon_1}(q)$ contained in $D$. Suppose $f(q) < p_{n+1} < g(q)$. Let $\gamma = \min\{p_{n+1} - f(q), g(q) - p_{n+1}\}$. Since $f$ and $g$ are continuous we can find $0 < \delta < \epsilon_1$ so that if $|x - q| < \delta$ in $\mathbb{R}^n$ then $|f(x) - f(q)| < \frac{\gamma}{2}$ and $|g(x) - g(q)| < \frac{\gamma}{2}$. Let $\epsilon_2 = \min\{\delta, \frac{\gamma}{2}\}$. Then $B_{\epsilon_2}(p) \subset E$ because any point $(x, x_{n+1}) \in B_{\epsilon_2}(p)$ is a point where $|x - q| < \delta$, so $f(x) < f(q) + \frac{\gamma}{2} < x_{n+1} < g(q) - \frac{\gamma}{2} < g(x)$. This contradicts $p$ being a boundary point of $E$. We conclude that either $p_{n+1} = f(q)$ or $p_{n+1} = g(q)$.

If $q \notin D$ then for some $\epsilon_3 > 0$ we know that $B_{\epsilon_3}(q) \cap D = \emptyset$ in $\mathbb{R}^n$. For every $(x,t) \in E$ we know that $x \in D$, so it follows that $B_{\epsilon_3}(p) \cap E = \emptyset$ in $\mathbb{R}^{n+1}$, so $p \notin \partial(E)$. Hence, if $p \notin G_1 \cup G_2$ then $q \in D \setminus D^\circ = \partial(D)$. Thus, $\partial(E) \subseteq G_1 \cup G_2 \cup W$.

Next, let $z \in G_1 \cup G_2 \cup W$, where $z = (y, z_{n+1})$ for $y = (z_1, z_2, z_3, \ldots, z_n) \in \mathbb{R}^n$. If $z \in G_1$ then $z_{n+1} = f(y)$ and so for any $\epsilon > 0$ it follows that $B_\epsilon(z)$ contains both $z \in E$ and $(y, f(y) - \frac{\epsilon}{2}) \notin E$, which means that $z \in \partial(E)$. Likewise, if $z \in G_2$ then $z_{n+1} = g(y)$ and so for any $\epsilon > 0$ it follows that $B_\epsilon(z)$ contains both $z \in E$ and $(y, g(y) + \frac{\epsilon}{2}) \notin E$, which means that $z \in \partial(E)$.

Let $z \in W \setminus (G_1 \cup G_2)$. Then for any $\epsilon > 0$ we know that $B_\epsilon(y)$ (in $\mathbb{R}^n$) contains a point $s \notin D$, and hence $B_\epsilon(z)$ (in $\mathbb{R}^{n+1}$) contains the point $(s, z_{n+1}) \notin E$ as well as the point $z \in E$, which means that $z \in \partial(E)$. Hence, $\partial(E) = G_1 \cup G_2 \cup W$.

Since the boundary of $E$ is contained in $E$ we know that $E$ is closed, so by the Heine-Borel Theorem $E$ is compact.

To show that $E$ is a Jordan region, we will show $\mathrm{Vol}(G_1) = \mathrm{Vol}(G_2) = \mathrm{Vol}(W) = 0$. Let $\epsilon > 0$. Let $R$ be a rectangle containing $D$ in $\mathbb{R}^n$. Since $D$ is compact, $f$ and $g$ are uniformly continuous on $D$. Choose $\delta > 0$ so that if $x, y \in D$ and $|x - y| < \delta$ then $|f(x) - f(y)| < \frac{\epsilon}{2|R|}$. Choose a grid $G = \{R_i\}_{1 \le i \le k}$ on $R$ so that $|G| < \delta$. For each $R_i \in G$ so that $R_i \cap D \neq \emptyset$, choose a point $x_i \in R_i \cap D$ and let $Q_i = R_i \times [f(x_i) - \frac{\epsilon}{2|R|}, f(x_i) + \frac{\epsilon}{2|R|}]$. Then $\{(x, f(x)) \in G_1 \mid x \in R_i\} \subset Q_i$. Thus, $W_1 = \{Q_i \mid R_i \cap D \neq \emptyset\}$ is a set of rectangles that covers $G_1$ and $\sum_{Q_i \in W_1} |Q_i| \le \frac{\epsilon}{|R|} \sum_{i=1}^k |R_i| = \epsilon$. Hence, $\mathrm{Vol}(G_1) = 0$. Similarly, replacing $f$ by $g$ and $G_1$ by $G_2$ we see that $\mathrm{Vol}(G_2) = 0$.

Since $D$ is compact, $f$ and $g$ are bounded on $D$ by the Extreme Value Theorem, so we pick $m, M$ so that $m < f(x) \le g(x) < M$ for all $x \in D$. Since $\partial(D)$ has volume zero we can find a collection of rectangles $\{K_i\}_{1 \le i \le t}$ covering $\partial(D)$ in $\mathbb{R}^n$ so that $\sum_{i=1}^t |K_i| < \frac{\epsilon}{M - m}$. For each $1 \le i \le t$, define $S_i = K_i \times [m, M]$. Then $\{S_i\}_{1 \le i \le t}$ is a collection of rectangles that covers $W$ and $\sum_{i=1}^t |S_i| < (M - m)\frac{\epsilon}{M - m} = \epsilon$. Hence $\mathrm{Vol}(W) = 0$. It follows that $\mathrm{Vol}(\partial(E)) = 0$, so $E$ is a Jordan region.


What follows is a generalization of Fubini’s Theorem in two variables (though it is not as 
strong as what is usually thought of as Fubini’s Theorem in R”). Fubini’s original theorem 
in two variables is part (c). 


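Theorem 12.51. Fubini's Theorem. Let $f : R = [a_1,b_1] \times [a_2,b_2] \to \mathbb{R}$ be an integrable function on $R$.

(a) Let $\int_{a_2}^{b_2} f(x,y)\,dy$ exist for each $x \in [a_1,b_1]$. Then $\int_R f = \int_{a_1}^{b_1} \left( \int_{a_2}^{b_2} f(x,y)\,dy \right) dx$.

(b) Let $\int_{a_1}^{b_1} f(x,y)\,dx$ exist for each $y \in [a_2,b_2]$. Then $\int_R f = \int_{a_2}^{b_2} \left( \int_{a_1}^{b_1} f(x,y)\,dx \right) dy$.

(c) Let $\int_{a_2}^{b_2} f(x,y)\,dy$ exist for each $x \in [a_1,b_1]$ and let $\int_{a_1}^{b_1} f(x,y)\,dx$ exist for each $y \in [a_2,b_2]$. Then $\int_R f = \int_{a_1}^{b_1} \left( \int_{a_2}^{b_2} f(x,y)\,dy \right) dx = \int_{a_2}^{b_2} \left( \int_{a_1}^{b_1} f(x,y)\,dx \right) dy$.

(d) Let $g : Q \to \mathbb{R}$ be integrable, where $Q = \prod_{i=1}^n [a_i,b_i]$ is a rectangle in $\mathbb{R}^n$, and $D = \prod_{i=1}^{n-1} [a_i,b_i]$ is a rectangle in $\mathbb{R}^{n-1}$. For each $x \in \prod_{i=1}^{n-1} [a_i,b_i] = Q_{n-1}$, let $g(x,t) : [a_n,b_n] \to \mathbb{R}$ also be integrable. Then $\int_Q g = \int_D \int_{a_n}^{b_n} g(x,t)\,dt\,dx$.

(e) Let $E_n$ be a fully projectable Jordan region contained in $Q$ so that with respect to a particular ordering of the variables $x_1, x_2, \ldots, x_n$ (after a possible re-labeling), there are Jordan regions $E_1, E_2, \ldots, E_{n-1}$ so that $E_k \subset \prod_{i=1}^k [a_i,b_i] = Q_k \subset \mathbb{R}^k$ for each $k \in \{2, 3, \ldots, n-1\}$ and $E_1 = [a_1,b_1]$, and also continuous functions $f_k, g_k : E_k \to \mathbb{R}$ for each $k \in \{1, 2, 3, \ldots, n-1\}$ so that $E_{k+1} = \{(x,t) \in \mathbb{R}^{k+1} \mid x \in E_k \text{ and } f_k(x) \le t \le g_k(x)\}$ for all $k \in \{1, 2, \ldots, n-1\}$. Then

$\int_{E_n} g = \int_{a_1}^{b_1} \int_{f_1(x_1)}^{g_1(x_1)} \int_{f_2(x_1,x_2)}^{g_2(x_1,x_2)} \cdots \int_{f_{n-1}(x_1,\ldots,x_{n-1})}^{g_{n-1}(x_1,\ldots,x_{n-1})} g(x_1, x_2, \ldots, x_n)\,dx_n \cdots dx_2\,dx_1.$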


n 


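Proof. (a) Let $G$ be the grid induced by partitions $P_1 = \{x_0, x_1, \ldots, x_n\}$ of $[a_1,b_1]$ and $P_2 = \{y_0, y_1, \ldots, y_m\}$ of $[a_2,b_2]$. Let $M_{ij}, m_{ij}$ denote the supremum and infimum of $f(x,y)$ over $[x_{i-1},x_i] \times [y_{j-1},y_j]$, for each $1 \le i \le n$ and $1 \le j \le m$. For a given integer $i \in [1,n]$, if $x_i^* \in [x_{i-1},x_i]$ and $y \in [y_{j-1},y_j]$ it is true that $m_{ij} \le f(x_i^*,y) \le M_{ij}$. Then $m_{ij}(y_j - y_{j-1}) \le \int_{y_{j-1}}^{y_j} f(x_i^*,y)\,dy \le M_{ij}(y_j - y_{j-1})$. Thus, it follows that

$\sum_{j=1}^m m_{ij}(y_j - y_{j-1}) \le \sum_{j=1}^m \int_{y_{j-1}}^{y_j} f(x_i^*,y)\,dy = \int_{a_2}^{b_2} f(x_i^*,y)\,dy \le \sum_{j=1}^m M_{ij}(y_j - y_{j-1}).$

Multiplying by $(x_i - x_{i-1})$ and taking the sum over $1 \le i \le n$ gives us

$\sum_{i=1}^n \sum_{j=1}^m m_{ij}(x_i - x_{i-1})(y_j - y_{j-1}) \le \sum_{i=1}^n \left( \int_{a_2}^{b_2} f(x_i^*,y)\,dy \right)(x_i - x_{i-1}) \le \sum_{i=1}^n \sum_{j=1}^m M_{ij}(x_i - x_{i-1})(y_j - y_{j-1}).$

Setting $M_i = \sup_{x_i^* \in [x_{i-1},x_i]} \int_{a_2}^{b_2} f(x_i^*,y)\,dy$ and $m_i = \inf_{x_i^* \in [x_{i-1},x_i]} \int_{a_2}^{b_2} f(x_i^*,y)\,dy$, we see that

$\sum_{i=1}^n \sum_{j=1}^m m_{ij}(x_i - x_{i-1})(y_j - y_{j-1}) \le \sum_{i=1}^n m_i(x_i - x_{i-1}) \le \sum_{i=1}^n M_i(x_i - x_{i-1}) \le \sum_{i=1}^n \sum_{j=1}^m M_{ij}(x_i - x_{i-1})(y_j - y_{j-1}).$

In other words, $L(f,G) \le L\left(\int_{a_2}^{b_2} f(x,y)\,dy, P_1\right) \le U\left(\int_{a_2}^{b_2} f(x,y)\,dy, P_1\right) \le U(f,G)$. We can, for every $\epsilon > 0$, find a grid $G$ so that $U(f,G) - L(f,G) < \epsilon$, which means that we can make $\sum_{i=1}^n M_i(x_i - x_{i-1}) - \sum_{i=1}^n m_i(x_i - x_{i-1}) < \epsilon$, from which we conclude that $\int_{a_1}^{b_1} \left( \int_{a_2}^{b_2} f(x,y)\,dy \right) dx$ exists, and since it is less than or equal to all upper sums and greater than or equal to all lower sums for $f$, we know $\int_{a_1}^{b_1} \left( \int_{a_2}^{b_2} f(x,y)\,dy \right) dx = \int_R f$.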
(b) This follows from (a) by switching the labels of the variables. 
(c) This is an immediate consequence of (a) and (b). 

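(d) This is similar to the proof of (a). Let $\epsilon > 0$. Since g is integrable on Q we can find
a grid $G = \{Q_i\}_{1 \le i \le t}$ on Q induced by the set of partitions $P = \{P_1, P_2, \dots, P_n\}$, where $P_i$ is
a partition of $[a_i,b_i]$ for each $1 \le i \le n$, so that $U(g,G) - L(g,G) < \epsilon$. Let $H = \{D_i\}_{1 \le i \le s}$
be the grid on $Q_{n-1}$ induced by $P_1, P_2, \dots, P_{n-1}$.

Let $P_n = \{z_0, z_1, \dots, z_m\}$. Let $M_{ij}, m_{ij}$ denote the supremum and infimum respectively
of $g(\mathbf{x},t)$ over $D_i \times [z_{j-1},z_j]$ for each $1 \le i \le s$ and $1 \le j \le m$. Then for any $D_i \in H$ if we
pick $\mathbf{x}_i^* \in D_i$ it follows that $m_{ij} \le g(\mathbf{x}_i^*,t) \le M_{ij}$ if $z_{j-1} \le t \le z_j$.

Hence, it follows that $m_{ij}(z_j - z_{j-1}) \le \int_{z_{j-1}}^{z_j} g(\mathbf{x}_i^*,t)\,dt \le M_{ij}(z_j - z_{j-1})$. Thus,

$\sum_{j=1}^{m} m_{ij}(z_j - z_{j-1}) \le \int_{a_n}^{b_n} g(\mathbf{x}_i^*,t)\,dt \le \sum_{j=1}^{m} M_{ij}(z_j - z_{j-1})$.

From this, we see that

$L(g,G) = \sum_{D_i \in H}\sum_{j=1}^{m} m_{ij}|D_i|(z_j - z_{j-1}) \le \sum_{D_i \in H} |D_i| \int_{a_n}^{b_n} g(\mathbf{x}_i^*,t)\,dt \le \sum_{D_i \in H}\sum_{j=1}^{m} M_{ij}|D_i|(z_j - z_{j-1}) = U(g,G)$.

Setting $M_i = \sup_{\mathbf{x}_i^* \in D_i} \int_{a_n}^{b_n} g(\mathbf{x}_i^*,t)\,dt$ and $m_i = \inf_{\mathbf{x}_i^* \in D_i} \int_{a_n}^{b_n} g(\mathbf{x}_i^*,t)\,dt$, the statement in the
preceding paragraph gives us that $L(g,G) \le \sum_{D_i \in H} m_i|D_i| \le \sum_{D_i \in H} M_i|D_i| \le U(g,G)$. Since
$\sum_{D_i \in H} m_i|D_i| = L\left(\int_{a_n}^{b_n} g(\mathbf{x},t)\,dt, H\right)$ and $\sum_{D_i \in H} M_i|D_i| = U\left(\int_{a_n}^{b_n} g(\mathbf{x},t)\,dt, H\right)$, this tells us
that $U\left(\int_{a_n}^{b_n} g(\mathbf{x},t)\,dt, H\right) - L\left(\int_{a_n}^{b_n} g(\mathbf{x},t)\,dt, H\right) \le U(g,G) - L(g,G) < \epsilon$, which means
that $\int_D \int_{a_n}^{b_n} g(\mathbf{x},t)\,dt$ exists. Furthermore, since $\int_Q g$ and $\int_D \int_{a_n}^{b_n} g(\mathbf{x},t)\,dt$ both lie in $[L(g,G),U(g,G)]$
we know that $\left|\int_Q g - \int_D \int_{a_n}^{b_n} g(\mathbf{x},t)\,dt\right| < \epsilon$. Since this is true for all $\epsilon > 0$, we conclude
$\int_D \int_{a_n}^{b_n} g(\mathbf{x},t)\,dt = \int_Q g$.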


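(e) This follows inductively from (d) and the fact that the integral of a function over a
Jordan region and the integral of its zero extension are the same. Since we are integrating
over E, we re-define g to be the zero extension of the original function g, extending g to Q,
and note that $\int_Q g = \int_E g$.

By (d) we have that $\int_Q g = \int_{Q_{n-1}} \int_{a_n}^{b_n} g(\mathbf{x},x_n)\,dx_n$, where $\mathbf{x} \in Q_{n-1}$. Since the only
non-zero values of $g(\mathbf{x},x_n)$ occur for $f_{n-1}(\mathbf{x}) \le x_n \le g_{n-1}(\mathbf{x})$ it follows that
$\int_E g = \int_{Q_{n-1}} \int_{f_{n-1}(\mathbf{x})}^{g_{n-1}(\mathbf{x})} g(\mathbf{x},x_n)\,dx_n$. Then, treating $\int_{f_{n-1}(\mathbf{x})}^{g_{n-1}(\mathbf{x})} g(\mathbf{x},x_n)\,dx_n$ as the integrand function,
we apply (d) again to give us that

$\int_E g = \int_{Q_{n-2}} \int_{a_{n-1}}^{b_{n-1}} \int_{f_{n-1}(\mathbf{x},x_{n-1})}^{g_{n-1}(\mathbf{x},x_{n-1})} g(\mathbf{x},x_{n-1},x_n)\,dx_n\,dx_{n-1}$, where $\mathbf{x} \in Q_{n-2}$.

Since we know that the inner integral is zero unless $f_{n-2}(\mathbf{x}) \le x_{n-1} \le g_{n-2}(\mathbf{x})$ (since g is zero
outside of E), it follows that

$\int_E g = \int_{Q_{n-2}} \int_{f_{n-2}(\mathbf{x})}^{g_{n-2}(\mathbf{x})} \int_{f_{n-1}(\mathbf{x},x_{n-1})}^{g_{n-1}(\mathbf{x},x_{n-1})} g(\mathbf{x},x_{n-1},x_n)\,dx_n\,dx_{n-1}$.

Repeating this process yields the indicated result that

$\int_E g = \int_{a_1}^{b_1}\int_{f_1(x_1)}^{g_1(x_1)}\int_{f_2(x_1,x_2)}^{g_2(x_1,x_2)}\cdots\int_{f_{n-1}(x_1,\dots,x_{n-1})}^{g_{n-1}(x_1,\dots,x_{n-1})} g(x_1,x_2,\dots,x_n)\,dx_n\cdots dx_2\,dx_1$.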


This theorem doesn’t just tell us that we can switch the order of integration. It also tells 
us that we can represent integrals as iterated integrals. Integral order can be rearranged 
into any order of variables for which the Jordan region is fully projectable in the manner 
described in part (e). 
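
As a quick numerical illustration of parts (a) through (c), the sketch below approximates both iterated integrals of a sample function over a rectangle with midpoint sums and compares them to a value computed by hand. The integrand $x e^{xy}$, the rectangle $[0,1] \times [0,2]$ and the helper names are assumptions chosen only for illustration.

```python
import math

def iterated_dy_dx(f, a1, b1, a2, b2, n=200):
    """Midpoint approximation of int_{a1}^{b1} ( int_{a2}^{b2} f(x, y) dy ) dx."""
    hx, hy = (b1 - a1) / n, (b2 - a2) / n
    total = 0.0
    for i in range(n):
        x = a1 + (i + 0.5) * hx
        inner = sum(f(x, a2 + (j + 0.5) * hy) for j in range(n)) * hy
        total += inner * hx
    return total

def iterated_dx_dy(f, a1, b1, a2, b2, n=200):
    """Same rectangle, but x is now the inner variable and y the outer one."""
    return iterated_dy_dx(lambda y, x: f(x, y), a2, b2, a1, b1, n)

def f(x, y):
    return x * math.exp(x * y)  # sample integrand, an illustrative choice

order_dy_dx = iterated_dy_dx(f, 0.0, 1.0, 0.0, 2.0)
order_dx_dy = iterated_dx_dy(f, 0.0, 1.0, 0.0, 2.0)
# Integrating in y first by hand gives int_0^1 (e^{2x} - 1) dx = (e^2 - 3)/2.
exact = (math.exp(2) - 3) / 2
print(order_dy_dx, order_dx_dy, exact)
```

Both orders use the same sample points, so they agree up to rounding; the comparison with the hand-computed value is the more informative check.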

Fubini’s Theorem also lets us prove Clairaut’s Theorem (already proven) more simply, 
and since its proof did not depend on Clairaut’s Theorem there is some value to giving this 
simpler proof here. 


Theorem 12.52. Clairaut's Theorem (again). Let $f : V \to \mathbb{R}$ be $C^2$, where V is an open
set in $\mathbb{R}^2$. Then $f_{xy} = f_{yx}$ on V.


Proof. Let $(x_0,y_0) \in V$. Since V is open we can choose $\Delta x, \Delta y > 0$ small enough so that the
rectangle $R = [x_0, x_0 + \Delta x] \times [y_0, y_0 + \Delta y] \subset V$. By the Fundamental Theorem of Calculus,

$\int_{x_0}^{x_0+\Delta x}\int_{y_0}^{y_0+\Delta y} f_{xy}(x,y)\,dy\,dx = \int_{x_0}^{x_0+\Delta x} f_x(x,y_0+\Delta y) - f_x(x,y_0)\,dx$
$= f(x_0+\Delta x, y_0+\Delta y) - f(x_0+\Delta x, y_0) - f(x_0, y_0+\Delta y) + f(x_0,y_0)$.

By Fubini's Theorem, it follows that
$f(x_0+\Delta x, y_0+\Delta y) - f(x_0+\Delta x, y_0) - f(x_0, y_0+\Delta y) + f(x_0,y_0) = \int_R f_{xy}$.

Likewise, if we were to integrate $f_{yx}$ we would get

$\int_{y_0}^{y_0+\Delta y}\int_{x_0}^{x_0+\Delta x} f_{yx}(x,y)\,dx\,dy = \int_{y_0}^{y_0+\Delta y} f_y(x_0+\Delta x, y) - f_y(x_0,y)\,dy$
$= f(x_0+\Delta x, y_0+\Delta y) - f(x_0+\Delta x, y_0) - f(x_0, y_0+\Delta y) + f(x_0,y_0) = \int_R f_{yx}$.

Thus, $\int_R f_{xy} - f_{yx} = 0$.

Suppose that $|f_{xy}(x_0,y_0) - f_{yx}(x_0,y_0)| = \epsilon > 0$. Then since $f_{xy} - f_{yx}$ is continuous we can
find a $\delta > 0$ so that if $|(x,y) - (x_0,y_0)| < \delta$ then $|(f_{xy} - f_{yx})(x,y) - (f_{xy} - f_{yx})(x_0,y_0)| < \frac{\epsilon}{2}$,
which means that $|f_{xy}(x,y) - f_{yx}(x,y)| > \frac{\epsilon}{2}$. Choosing $\Delta x, \Delta y < \frac{\delta}{\sqrt{2}}$ it must follow
that $|f_{xy}(x,y) - f_{yx}(x,y)| > \frac{\epsilon}{2}$ on R and $f_{xy}(x,y) - f_{yx}(x,y)$ is either positive on R or negative
on R, which means that $\left|\iint_R f_{xy}(x,y) - f_{yx}(x,y)\,dA\right| > \Delta x \Delta y \left(\frac{\epsilon}{2}\right)$, a contradiction.
It follows that $f_{xy}(x_0,y_0) = f_{yx}(x_0,y_0)$.
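
The key identity in this proof, that the double difference of f over the corners of R equals the integral of the mixed partial over R, can be checked numerically. The sketch below is only an illustration: the function $f(x,y) = \sin(xy) + x^3y^2$, the rectangle and the helper names are assumptions, and the mixed partial is worked out by hand.

```python
import math

def f(x, y):
    return math.sin(x * y) + x**3 * y**2  # sample C^2 function (illustrative)

def f_xy(x, y):
    # Mixed partial by hand: f_x = y cos(xy) + 3 x^2 y^2, then differentiate in y.
    return math.cos(x * y) - x * y * math.sin(x * y) + 6 * x**2 * y

def integral_over_rectangle(g, x0, y0, dx, dy, n=200):
    """Midpoint approximation of the integral of g over [x0, x0+dx] x [y0, y0+dy]."""
    hx, hy = dx / n, dy / n
    return sum(g(x0 + (i + 0.5) * hx, y0 + (j + 0.5) * hy)
               for i in range(n) for j in range(n)) * hx * hy

x0, y0, dx, dy = 0.3, 0.5, 0.4, 0.2
double_difference = (f(x0 + dx, y0 + dy) - f(x0 + dx, y0)
                     - f(x0, y0 + dy) + f(x0, y0))
integral = integral_over_rectangle(f_xy, x0, y0, dx, dy)
print(double_difference, integral)
```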


The ideas of area and volume now have multiple definitions in terms of integrals. Recall,
for instance, that the area between two functions was supposed to be the integral of the
difference of these functions, $\int_a^b g(x) - f(x)\,dx$. However, we also have a version of area
(or two-volume) defined by inner and outer sums which is $\int_R 1\,dA$, which by Fubini's
Theorem is $\int_a^b \int_{f(x)}^{g(x)} 1\,dy\,dx$, which is $\int_a^b g(x) - f(x)\,dx$, so the two notions of area are the same.


While this type of observation lets us determine that integrals give volumes which agree 
with the preceding section’s definition of volume, we have to work slightly harder in order 
to get that the notions of volume are the same for regions that are not fully projectable. 
This is the objective of the theorem below. 




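Theorem 12.53. Let D be a closed Jordan region in $\mathbb{R}^n$, and let $f, g : D \to \mathbb{R}$ be continuous
functions so that $f(\mathbf{x}) \le g(\mathbf{x})$ on D. Then $\mathrm{Vol}(E) = \int_D g - f$, where E is the region between the
graphs of f and g over D (as in Theorem 12.50).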


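Proof. Let $G_1 = \{(\mathbf{x},f(\mathbf{x})) \in \mathbb{R}^{n+1} \mid \mathbf{x} \in D\}$ and $G_2 = \{(\mathbf{x},g(\mathbf{x})) \in \mathbb{R}^{n+1} \mid \mathbf{x} \in D\}$ and
$W = \{(\mathbf{x}, x_{n+1}) \mid \mathbf{x} \in \partial(D) \text{ and } f(\mathbf{x}) \le x_{n+1} \le g(\mathbf{x})\}$. Recall that
by Theorem 12.50, E is a compact Jordan region and $\partial(E) = G_1 \cup G_2 \cup W$.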

Let $\epsilon > 0$. We first show the result is true if $f(\mathbf{x}) < g(\mathbf{x})$ on D. Let $m = \min_{\mathbf{x} \in D} g(\mathbf{x}) - f(\mathbf{x}) > 0$.
Since f and g are uniformly continuous on D we can find $\delta > 0$ so that if
$|\mathbf{x} - \mathbf{y}| < \delta$ then $|g(\mathbf{x}) - g(\mathbf{y})| < \frac{m}{2}$ and $|f(\mathbf{x}) - f(\mathbf{y})| < \frac{m}{2}$ and therefore $g(\mathbf{x}) > f(\mathbf{y})$.

By Theorem 12.49 we can find a grid $G = \{R_i\}_{1 \le i \le k}$ on a rectangle R containing D
so that $|G| < \delta$ and all upper and lower sums and upper and lower inner and outer sums of
f and g with respect to G are within $\frac{\epsilon}{2}$ of the respective integrals of f and g. In particular,

$\left|\sum_{R_i \in I(D,G)} (m_i(g) - M_i(f))|R_i| - \left(\int_D g - \int_D f\right)\right| < \epsilon$.

The rectangles $R_i \times [M_i(f), m_i(g)]$ are contained in the interior of E if $R_i \in I(D,G)$ because
$f(\mathbf{x}) < g(\mathbf{x})$ on each $R_i$. Hence, $\mathrm{Vol}(E) \ge \sum_{R_i \in I(D,G)} (m_i(g) - M_i(f))|R_i|$. Likewise,

$\left|\sum_{R_i \in O(D,G)} (M_i(g) - m_i(f))|R_i| - \left(\int_D g - \int_D f\right)\right| < \epsilon$

and $E \subseteq \bigcup_{R_i \in O(D,G)} R_i \times [m_i(f), M_i(g)]$ because each point of E has coordinates
other than the $x_{n+1}$ coordinate which are inside D and if $(\mathbf{x},t) \in E$ then $\mathbf{x} \in R_i$ for some
$R_i \in O(D,G)$ and $t \in [m_i(f), M_i(g)]$. Thus, $\mathrm{Vol}(E) \le \sum_{R_i \in O(D,G)} (M_i(g) - m_i(f))|R_i|$. Since
both $\sum_{R_i \in I(D,G)} (m_i(g) - M_i(f))|R_i|$ and $\sum_{R_i \in O(D,G)} (M_i(g) - m_i(f))|R_i|$ are within a distance
$\epsilon$ of $\int_D g - \int_D f$ we conclude that $\left|\mathrm{Vol}(E) - \int_D g - f\right| < \epsilon$ for all $\epsilon > 0$ and therefore
$\mathrm{Vol}(E) = \int_D g - f$.

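Next, let $f(\mathbf{x}) \le g(\mathbf{x})$ on D and extend f and g to their zero extensions on R. Choose
a $\delta > 0$ so that if $|\mathbf{x} - \mathbf{y}| < \delta$ then $|f(\mathbf{x}) - f(\mathbf{y})| < \frac{\epsilon}{2|R|}$ and $|g(\mathbf{x}) - g(\mathbf{y})| < \frac{\epsilon}{2|R|}$.

Choose a grid $G = \{R_i\}_{1 \le i \le k}$ on R with $|G| < \delta$.

Let $W = \{R_i \in G \mid m_i(g) - M_i(f) > 0\}$. For each $R_i \in W$ we know that
$\mathrm{Vol}((R_i \times \mathbb{R}) \cap E) = \int_{R_i} g - f$ by the preceding argument.

For each $R_i \in G \setminus W$ we know that $(R_i \times \mathbb{R}) \cap E \subseteq R_i \times [m_i(f), M_i(g)]$, and that
$M_i(g) - m_i(f) \le M_i(g) - m_i(g) + M_i(f) - m_i(f) < \frac{\epsilon}{|R|}$ since $M_i(f) \ge m_i(g)$. Thus,

$\mathrm{Vol}\left(E \setminus \bigcup_{R_i \in W} ((R_i \times \mathbb{R}) \cap E)\right) \le \sum_{R_i \in G \setminus W} |R_i| \cdot \frac{\epsilon}{|R|} \le \epsilon$.

It follows that $\left|\sum_{R_i \in W} \mathrm{Vol}((R_i \times \mathbb{R}) \cap E) - \mathrm{Vol}(E)\right| \le \epsilon$.

We know $\int_D g - f = \sum_{R_i \in W} \int_{R_i} g - f + \sum_{R_i \in G \setminus W} \int_{R_i} g - f$, so
$\int_D g - f = \sum_{R_i \in W} \mathrm{Vol}((R_i \times \mathbb{R}) \cap E) + \sum_{R_i \in G \setminus W} \int_{R_i} g - f$. Hence

$0 \le \sum_{R_i \in G \setminus W} \int_{R_i} g - f \le \sum_{R_i \in G \setminus W} M_i(g - f)|R_i| \le \sum_{R_i \in G \setminus W} (M_i(g) - m_i(f))|R_i| < \sum_{R_i \in G \setminus W} |R_i| \frac{\epsilon}{|R|} \le \epsilon$.

This means that $\left|\sum_{R_i \in W} \mathrm{Vol}((R_i \times \mathbb{R}) \cap E) - \int_D g - f\right| < \epsilon$
so $\left|\mathrm{Vol}(E) - \int_D g - f\right| < 2\epsilon$. Since this is true for all $\epsilon > 0$ it follows that $\int_D g - f = \mathrm{Vol}(E)$.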


Note that with a simple re-labeling of axes (or by Theorems 12.60 and 12.57, interchanging
coordinates) if D is a closed Jordan region the coordinates of which are listed in
$(x_1, x_2, \dots, x_{j-1}, x_{j+1}, \dots, x_{n+1})$ then if f and g are continuous real valued functions on D then
$E = \{(x_1, x_2, \dots, x_{n+1}) \mid (x_1, x_2, \dots, x_{j-1}, x_{j+1}, \dots, x_{n+1}) \in D \text{ and } f(x_1, x_2, \dots, x_{j-1}, x_{j+1}, \dots, x_{n+1}) \le
x_j \le g(x_1, x_2, \dots, x_{j-1}, x_{j+1}, \dots, x_{n+1})\}$ is a Jordan region.


Setting up a double integral: 
While a graph is not required, strictly speaking, it is a good idea to graph each domain
over which we want to set up an iterated integral. The formula, as already described
in the theorems above, is that if R is a type one region so that
$R = \{(x,y) \in \mathbb{R}^2 \mid a \le x \le b \text{ and } g_1(x) \le y \le g_2(x)\}$ where $g_1$ and $g_2$ are continuous, then
$\int_R f(x,y)\,dA = \int_a^b \int_{g_1(x)}^{g_2(x)} f(x,y)\,dy\,dx$.


In practice, we usually think about the process of setting up an iterated integral as
follows. First, we decide on a variable to be the "outer" variable, which is the variable
corresponding to the bounds in the leftmost integral sign. We then list that variable
furthest to the right on the right side (so if x is the outer variable then the integral looks
like $\int_a^b \int_{g_1(x)}^{g_2(x)} f(x,y)\,dy\,dx$ for a type one region, whereas if y is the outer variable then the
integral has form $\int_c^d \int_{h_1(y)}^{h_2(y)} f(x,y)\,dx\,dy$ for a type two region).
Often, we could use either variable order assuming we are willing to subdivide the region 


into smaller regions (possibly writing the integral as a sum of integrals). The choice of which 
variable to use as the outer variable in these cases generally comes down to convenience 
and familiarity, but in some cases choosing one order of integration makes a huge difference 
in the difficulty of the problem. Sometimes finding an antiderivative in one order is simple 
but very difficult in the other. Sometimes we can express an integral as a single integral in 
one order but must break the region into many separate integrals if we write the integral 
in the other order. 

Once we have chosen an outer variable, the other variable is the (first) “inner” variable 
(though “outer” and “inner” variables are not a formally accepted terminology throughout 
texts in general). The bounds for the outer variable are always numerical. They range over 
all the values that variable can take on in points of R. In other words, the integral is taken 
over the projection of R onto that variable’s coordinates. For instance, if x is the outer 
variable for a connected region R then you would have x start at the minimum a of all x 
values so that (x,y) € R, and then the upper bound of the integral would be b, which would 




be the maximum of all x-values so that (x,y) € R. 

The bounds for the inner variable are not normally numbers. They are the functions
$g_1(x)$ and $g_2(x)$. For any given x-value that means the bounds are numbers ($g_1(x)$ and
$g_2(x)$ are numbers) but these numbers vary with x (because outside of the values between
$g_1(x)$ and $g_2(x)$, the zero extension of f is zero). Thus, you think of the bounds of the outer variable
as from the minimum to maximum value (numbers) for that variable, and think of the
inner variable (say it is y for this description) as having bounds $y = g_1(x)$
to $y = g_2(x)$ for each x in the outer variable range.

Here is an example: 


Example 12.1. Let R be the region bounded by the triangle with vertices (0,0), (2,0) and
(2,4) in the plane. Find $\int_R xy^2\,dA$, the volume under $f(x,y) = xy^2$ over the region R.

Solution. We will choose x for the outer variable and y for the inner variable. We can see
that the region R is bounded by the lines $y = 0$, $y = 2x$ and $x = 2$. This is easier to see
from the graph below. Having determined this, the smallest value of x in the region is 0
and the largest value is 2. Over this interval $0 \le x \le 2$, the y values are between $y = 0$
and $y = 2x$ for each value of x. Hence,

$\int_R xy^2\,dA = \int_0^2 \int_0^{2x} xy^2\,dy\,dx = \int_0^2 \frac{xy^3}{3}\Big|_0^{2x}\,dx = \int_0^2 \frac{8x^4}{3}\,dx = \frac{8x^5}{15}\Big|_0^2 = \frac{256}{15}$.

[Figure: graph of region R, the triangle bounded by $y = 2x$, $y = 0$ and $x = 2$.]


Notice that we could have labeled the $y = 2x$ line as $x = \frac{y}{2}$ over $0 \le y \le 4$, the interval
of values from the smallest to the largest values of y over the region. Note that the lower
bound for the inner variable is $\frac{y}{2}$ and the upper bound is 2 because the integral's bounds are
from the smallest value of the variable within the region to the largest one (and $x = \frac{y}{2}$ is
a smaller value of x than $x = 2$ over $y \in [0,4]$). Hence, had we chosen y as the outer variable
we could have set up the integral (and gotten the same answer, as guaranteed by Fubini's
theorem) as follows:

$\int_R xy^2\,dA = \int_0^4 \int_{y/2}^{2} xy^2\,dx\,dy = \int_0^4 \frac{x^2y^2}{2}\Big|_{y/2}^{2}\,dy = \int_0^4 2y^2 - \frac{y^4}{8}\,dy = \frac{2y^3}{3} - \frac{y^5}{40}\Big|_0^4
= \frac{128}{3} - \frac{1024}{40} = \frac{128}{3} - \frac{128}{5} = \frac{256}{15}$.

It is frequently asked whether there is an algorithm for setting up an iterated integral 
that does not rely on knowing what the graph looks like. Such algorithms tend to only work 
for certain classes of cases and tend to be more difficult than looking at a graph. Part of 
the obstacle is trying to decide which points are in the region bounded by a set of curves. 
Technically, it does seem to be possible to describe an algorithm for doing this, but we are 
not aware of any algorithms that are not excessively complicated. 

In some simpler cases, however, one can see aspects of the region without a graph. If
you can identify that a portion R of a region bounded by curves is the set of points whose
x-values are between two values and whose y-values are between two functions then you
can set up the bounds as we did above.

It is generally sensible to find the intersection points of the curves in a collection of
curves that bound R and then try to express the region as the points between values of
intersection points in one variable with the other variable between the lowest
and highest functions of that variable over the values of the outer variable.

Usually, graphing is actually not necessary to see the lowest and highest values of a 
variable in a region and then use the boundary curves to express the values in the region 
as between two functions of the outer variable. However, graphing is a good idea to help 
you check your work, and is a very helpful tool for seeing what the bound should be when 
it is not obvious from the boundary functions listed. 

A useful observation that is so immediate that it probably doesn’t deserve to be called a 
theorem (but which we will present as a theorem anyway so we can refer to it more easily) 
is that if an integrand can be written as a product of functions of single variables and the 
bounds of the iterated integral are all numerical, then the integral is the same as the product 
of the corresponding individual integrals in the corresponding variables. 


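Theorem 12.54. Let $f_i(x_i)$ be continuous on $[a_i,b_i]$ for all $1 \le i \le n$ for some $n \in \mathbb{N}$,
and let $R = \prod_{i=1}^{n}[a_i,b_i]$. Then

$\int_R f_1(x_1)f_2(x_2)\cdots f_n(x_n) = \int_{a_1}^{b_1}\int_{a_2}^{b_2}\cdots\int_{a_n}^{b_n} f_1(x_1)f_2(x_2)\cdots f_n(x_n)\,dx_n\cdots dx_2\,dx_1$
$= \left(\int_{a_1}^{b_1} f_1(x_1)\,dx_1\right)\left(\int_{a_2}^{b_2} f_2(x_2)\,dx_2\right)\cdots\left(\int_{a_n}^{b_n} f_n(x_n)\,dx_n\right)$.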


Proof. Since $f_i(x_i)$ for $1 \le i \le n-1$ are all constant when integrating with respect to $x_n$,
we can move these functions outside of the integral sign for the integral with respect to $x_n$,
which gives us

$\int_{a_1}^{b_1}\int_{a_2}^{b_2}\cdots\int_{a_n}^{b_n} f_1(x_1)f_2(x_2)\cdots f_n(x_n)\,dx_n\cdots dx_2\,dx_1$
$= \int_{a_1}^{b_1}\int_{a_2}^{b_2}\cdots\int_{a_{n-1}}^{b_{n-1}} f_1(x_1)f_2(x_2)\cdots f_{n-1}(x_{n-1})\left(\int_{a_n}^{b_n} f_n(x_n)\,dx_n\right)dx_{n-1}\cdots dx_2\,dx_1$,

and then since $\int_{a_n}^{b_n} f_n(x_n)\,dx_n$ is constant relative to integration with respect to the other
variables, we can take $\int_{a_n}^{b_n} f_n(x_n)\,dx_n$ out of the integral, giving us

$\left(\int_{a_n}^{b_n} f_n(x_n)\,dx_n\right)\int_{a_1}^{b_1}\int_{a_2}^{b_2}\cdots\int_{a_{n-1}}^{b_{n-1}} f_1(x_1)f_2(x_2)\cdots f_{n-1}(x_{n-1})\,dx_{n-1}\cdots dx_2\,dx_1$.

We continue, taking $\int_{a_{n-1}}^{b_{n-1}} f_{n-1}(x_{n-1})\,dx_{n-1}$ out of the integral and so on, ending with

$\int_R f_1(x_1)f_2(x_2)\cdots f_n(x_n) = \left(\int_{a_1}^{b_1} f_1(x_1)\,dx_1\right)\left(\int_{a_2}^{b_2} f_2(x_2)\,dx_2\right)\cdots\left(\int_{a_n}^{b_n} f_n(x_n)\,dx_n\right)$.


Here is an example of how this theorem can make iterated integrals simpler: 


Example 12.2. Find $\int_0^1 \int_0^2 x^2 e^y\,dy\,dx$.

Solution. Since the bounds are numerical and we can represent the integrand as a product of a function
of x (namely $x^2$) and a function of y (namely $e^y$), we can write

$\int_0^1 \int_0^2 x^2 e^y\,dy\,dx = \left(\int_0^1 x^2\,dx\right)\left(\int_0^2 e^y\,dy\right) = \frac{x^3}{3}\Big|_0^1 \cdot e^y\Big|_0^2 = \frac{1}{3}(e^2 - 1)$.

The gains from using this trick are largely notational (we don't have to write as much
on each step). This trick, while frequently used, only saves a little bit of time. We should be
careful when using it that we do not attempt to use it at the wrong time. If the bounds are
not numerical then it doesn't make sense, and if we can't write the integrand as a product
of factors which are each single variable functions then this doesn't make sense either.
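
To see Theorem 12.54 at work numerically, the sketch below compares a midpoint approximation of a double integral over a rectangle with the product of two one-variable midpoint approximations. The integrand $x^2 e^y$ on $[0,1] \times [0,2]$ and the helper names are assumptions chosen for illustration.

```python
import math

def midpoint_1d(g, a, b, n=1000):
    """Midpoint approximation of int_a^b g(t) dt."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

def midpoint_2d(f, a, b, c, d, n=300):
    """Midpoint approximation of the integral of f over [a, b] x [c, d]."""
    hx, hy = (b - a) / n, (d - c) / n
    return sum(f(a + (i + 0.5) * hx, c + (j + 0.5) * hy)
               for i in range(n) for j in range(n)) * hx * hy

double_integral = midpoint_2d(lambda x, y: x**2 * math.exp(y), 0, 1, 0, 2)
product = midpoint_1d(lambda x: x**2, 0, 1) * midpoint_1d(math.exp, 0, 2)
exact = (math.exp(2) - 1) / 3  # (1/3) * (e^2 - 1), computed by hand
print(double_integral, product, exact)
```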


Changing the order of integration: 


Changing the order of integration is a method which is frequently used for simplifying 
an iterated integral. Sometimes an iterated integral does not have a nice antiderivative that 
can be found in the order in which the integral is written, but by expressing the bounds 
for the same region in another order the iterated integral becomes more easily integrated 
because the antiderivatives are much simpler. This is mainly helpful if the region is of 
type one and two (in general, it is more likely to be a useful method for iterated integrals
if the region of integration is fully projectable with respect to multiple arrangements of the
variables). Even if the region is not both type one and two, however, it can be useful if the 
region can be broken into smaller regions which are type one and two. 

There doesn’t really seem to be a nice algorithm which is universally applicable for 
reversing the order of integration in all cases. We recommend graphing the region expressed 
by an integral’s bounds, and then looking at how to express the region as an iterated integral 




in the other variable order. However, assuming we have a type one and two region we can 
(sort of) describe an algorithm for reversing the order of integration without the use of a 
graph. The advantage to graphing is that it makes it easier to see whether the region is 
type one and type two and what the higher and lower function is over each interval (or 
whether the region would be better broken into multiple separate regions with separate 
iterated integrals). 

The procedure when a region is both type one and type two is that you invert the 
functions representing the inner integral bounds if possible, determine the smallest and 
largest values of the inner variable, make those the bounds for the new outer variable and 
use the inverse functions as boundary curves to help you create bounds for the new inner 
variable integral bounds. 

Here is an example: 


Example 12.3. Integrate $\int_0^4 \int_{\sqrt{y}}^{2} e^{x^3}\,dx\,dy$.

Solution. Taking this integral in the current order won't work well. We don't have a nice
antiderivative of $e^{x^3}$ if we are integrating with respect to x. So, we will try switching the
order of integration to see if that helps us evaluate this integral.

We will refer to R as the region described by the bounds of this integral, which is the
region where $0 \le y \le 4$, and $\sqrt{y} \le x \le 2$ for each value of y. So, it is the region bounded
on the left by $x = \sqrt{y}$ which is also the curve $y = x^2$ and bounded on the right by $x = 2$.
This is the graph.

[Figure: graph of region R, bounded by $y = x^2$, $y = 0$ and $x = 2$.]

To set up the bounds for the integral in the other order, we must use x as the outer
variable. We look for the smallest value of x in the region, which is 0, and the largest,
which is 2, to get the bounds of the outer integral. Over $0 \le x \le 2$ we see that the curves
bounding the region expressed as functions of x are $y = 0$ and $y = x^2$.

Hence, the integral becomes $\int_0^2 \int_0^{x^2} e^{x^3}\,dy\,dx$. (Note that the integrand does not change
when the order of integration is switched, only the bounds.) This is
$\int_0^2 y e^{x^3}\Big|_0^{x^2}\,dx = \int_0^2 x^2 e^{x^3}\,dx$.
Setting $u = x^3$ we have $du = 3x^2\,dx$, so the integral is $\frac{1}{3}\int_0^8 e^u\,du = \frac{1}{3}e^u\Big|_0^8 = \frac{e^8 - 1}{3}$.
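
A numerical check of Example 12.3 is sketched below: midpoint sums in the original order ($dx\,dy$) and in the reversed order ($dy\,dx$) are compared with the closed form $\frac{e^8-1}{3}$. The helper `iterated` is hypothetical, introduced only for this illustration; its integrand takes the outer variable first.

```python
import math

def iterated(f, a, b, lo, hi, n=400):
    """Midpoint approximation of int_a^b int_{lo(u)}^{hi(u)} f(u, v) dv du."""
    hu = (b - a) / n
    total = 0.0
    for i in range(n):
        u = a + (i + 0.5) * hu
        v0, v1 = lo(u), hi(u)
        hv = (v1 - v0) / n
        total += sum(f(u, v0 + (j + 0.5) * hv) for j in range(n)) * hv * hu
    return total

# Original order: y outer on [0, 4], x inner from sqrt(y) to 2.
original = iterated(lambda y, x: math.exp(x**3), 0.0, 4.0, math.sqrt, lambda y: 2.0)
# Reversed order: x outer on [0, 2], y inner from 0 to x^2.
reversed_order = iterated(lambda x, y: math.exp(x**3), 0.0, 2.0,
                          lambda x: 0.0, lambda x: x**2)
closed_form = (math.exp(8) - 1) / 3
print(original, reversed_order, closed_form)
```

The numerical method does not care which order is used; the point of switching the order is that only the reversed order has an antiderivative we can write down by hand.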


In the preceding example, suppose we didn't want to graph anything. Well, we could
have said that the inverse of $x = \sqrt{y}$ is $y = x^2$, and we could have said that over $0 \le y \le 4$
the smallest and largest values of $x = \sqrt{y}$ were 0 and 2, which would mean that we
would have 0 and 2 as the bounds for the outer x-variable integral, and then we would go
from $y = 0$ (since that is the smallest value for y) to $y = x^2$ for the inner variable. We have
to be careful when using this process, though, because reversing the order of integration
in general can't just be done by inverting functions, finding maxima and minima and then
switching the integrals. Suppose, for instance, that the initial integral bounds had been
$\int_0^1 \int_{\sqrt{y}}^{2} f\,dx\,dy$ for a function f. Then the inverse of $x = \sqrt{y}$ would be $y = x^2$ as before, but
because the function $x = \sqrt{y}$ does not reach $x = 2$ on $0 \le y \le 1$ we would have to express
the integral with two integrals if we switch the variables. This would be the graph:

[Figure: graph of region R, bounded by $y = x^2$, $y = 0$, $y = 1$ and $x = 2$.]

So, in reversed order the integral would be $\int_0^1 \int_0^{x^2} f\,dy\,dx + \int_1^2 \int_0^1 f\,dy\,dx$. Depending
on the integrand, this switching of variables to trade one integral for two might be a
mistake (this might make the problem harder).
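
The need for two integrals in the reversed order can be checked with a concrete integrand. The sketch below uses the sample function $f(x,y) = x + y$ (an assumption for illustration, as is the helper `iterated`) over the region $0 \le y \le 1$, $\sqrt{y} \le x \le 2$, for which a hand computation in either order gives 2.35.

```python
import math

def iterated(f, a, b, lo, hi, n=200):
    """Midpoint approximation of int_a^b int_{lo(u)}^{hi(u)} f(u, v) dv du."""
    hu = (b - a) / n
    total = 0.0
    for i in range(n):
        u = a + (i + 0.5) * hu
        v0, v1 = lo(u), hi(u)
        hv = (v1 - v0) / n
        total += sum(f(u, v0 + (j + 0.5) * hv) for j in range(n)) * hv * hu
    return total

def f(x, y):
    return x + y  # sample integrand, an illustrative choice

# Original order: y outer on [0, 1], x inner from sqrt(y) to 2.
original = iterated(lambda y, x: f(x, y), 0.0, 1.0, math.sqrt, lambda y: 2.0)
# Reversed order needs two pieces: under y = x^2 on [0, 1], then under y = 1 on [1, 2].
left_piece = iterated(f, 0.0, 1.0, lambda x: 0.0, lambda x: x**2)
right_piece = iterated(f, 1.0, 2.0, lambda x: 0.0, lambda x: 1.0)
print(original, left_piece + right_piece)
```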

A common error for readers who are not paying attention to the descriptions of what the
bounds mean is to simply switch the integrals and keep the original variable representations
of the bounds. This does not work. Consider the integral $\int_0^1 \int_x^1 e^{y^2}\,dy\,dx$. If we just switched
the integrals we would have $\int_x^1 \int_0^1 e^{y^2}\,dx\,dy$. The outer integral doesn't even have numerical
bounds so the output of the calculation would have a variable in the answer (which is
nonsense). The only time you can just switch the integral order and keep the same bounds
as described is if the bounds are all numerical (meaning that you are integrating over a
rectangle). It is correct that $\int_a^b \int_c^d f(x,y)\,dy\,dx = \int_c^d \int_a^b f(x,y)\,dx\,dy$, which is probably
part of the reason students of calculus sometimes think they can make the same sort of
switch when there are non-numerical bounds.


We conclude this section by addressing how to find the volume between surfaces. 


Finding the volume between two surfaces: 


We proved in Theorem 12.53 that if $g(x,y) \le f(x,y)$ over a Jordan region R then the
volume of the region E between the graphs of $z = g(x,y)$ and $z = f(x,y)$ over the region
R is $\int_R f(x,y) - g(x,y)\,dA$.

In some cases f and g might cross through each other, of course, in which case we would
have to break R into sub-regions where one function is less than the other. Thus, in general,
the volume between f and g over R is $V = \int_R |f(x,y) - g(x,y)|\,dA$.


Example 12.4. Find the volume between the functions $g(x,y) = x + 2y$ and $f(x,y) =
9 - x^2 - y^2$ over the region R, where R is the region between the functions $y = 0$ and $y = x^2$
over $0 \le x \le 1$.

Solution. This is $\int_0^1 \int_0^{x^2} 9 - x^2 - y^2 - x - 2y\,dy\,dx$ since $f(x,y) \ge g(x,y)$ on all of R. This
becomes:

$\int_0^1 9y - x^2y - \frac{y^3}{3} - xy - y^2\Big|_0^{x^2}\,dx = \int_0^1 9x^2 - x^4 - \frac{x^6}{3} - x^3 - x^4\,dx
= 3x^3 - \frac{2x^5}{5} - \frac{x^7}{21} - \frac{x^4}{4}\Big|_0^1 = \frac{967}{420}$.
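
The arithmetic in Example 12.4 can be checked exactly. The sketch below reads the region as $0 \le y \le x^2$, $0 \le x \le 1$, integrates the outer polynomial term by term with exact fractions, and adds an independent midpoint approximation of the double integral. The dictionary representation and function names are assumptions made for this illustration.

```python
from fractions import Fraction

# After the inner integration the outer integrand is the polynomial
# 9x^2 - x^3 - 2x^4 - x^6/3, stored here as {power: coefficient}.
outer_integrand = {2: Fraction(9), 3: Fraction(-1), 4: Fraction(-2), 6: Fraction(-1, 3)}

# On [0, 1] the term c*x^k integrates to c/(k+1).
volume = sum(c / (k + 1) for k, c in outer_integrand.items())

def midpoint_volume(n=300):
    """Midpoint approximation of int_0^1 int_0^{x^2} (f - g) dy dx."""
    hx = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        hy = x**2 / n
        for j in range(n):
            y = (j + 0.5) * hy
            total += (9 - x**2 - y**2 - x - 2 * y) * hy * hx
    return total

print(volume, float(volume), midpoint_volume())
```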


Finding the average value of a function over a region: 


Definition 95 


The average value of an integrable function f over a Jordan region $E \subset \mathbb{R}^n$ is
$\frac{1}{\mathrm{Vol}(E)}\int_E f$.


Finding an average value is fairly straightforward (we just plug into the formula above). 
The average value is the height at which, if we took a constant function, the integral of 
the constant function would be the same as the integral of the original function. For a two 
variable function this is easy to visualize. If we pretend that ice does not change in volume 
when it melts and becomes water (which isn’t true) then we could think of the volume 
under a surface over the region R in the xy-plane as being filled with ice. If the ice were 
then to melt but remain contained above the region R then the water level would be the 
average value of the function. 


Example 12.5. Find the average value of the function $f(x,y) = xy^2$ over the region R
bounded by $y = \sqrt{x}$, $y = 0$ and $x = 4$.

Solution. It would eliminate fractional powers to use y as the outer variable in this problem,
which might be nice. So, the smallest value of y would be zero, and the largest would be
two (the square root of four). Then $y = \sqrt{x}$ could be changed to $x = y^2$, so that $y^2 \le x \le 4$.

Thus, the area of the region R would be $\int_0^2 4 - y^2\,dy = 4y - \frac{y^3}{3}\Big|_0^2 = \frac{16}{3}$.

The average value would then be

$\frac{3}{16}\int_0^2 \int_{y^2}^{4} xy^2\,dx\,dy = \frac{3}{16}\int_0^2 \frac{x^2y^2}{2}\Big|_{y^2}^{4}\,dy = \frac{3}{16}\int_0^2 8y^2 - \frac{y^6}{2}\,dy
= \frac{3}{16}\left(\frac{8y^3}{3} - \frac{y^7}{14}\right)\Big|_0^2 = \frac{16}{7}$.

[Figure: graph of region R, bounded by $y = \sqrt{x}$ (or $x = y^2$), $y = 0$ and $x = 4$.]
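
The average value formula can be tried out numerically on Example 12.5, reading the integrand as $xy^2$ and using y as the outer variable. The sketch below approximates the area and the integral with midpoint sums and divides; the helper `iterated` is hypothetical, introduced only for this illustration.

```python
def iterated(f, a, b, lo, hi, n=300):
    """Midpoint approximation of int_a^b int_{lo(u)}^{hi(u)} f(u, v) dv du."""
    hu = (b - a) / n
    total = 0.0
    for i in range(n):
        u = a + (i + 0.5) * hu
        v0, v1 = lo(u), hi(u)
        hv = (v1 - v0) / n
        total += sum(f(u, v0 + (j + 0.5) * hv) for j in range(n)) * hv * hu
    return total

# Region: 0 <= y <= 2 (outer), y^2 <= x <= 4 (inner).
area = iterated(lambda y, x: 1.0, 0.0, 2.0, lambda y: y**2, lambda y: 4.0)
integral = iterated(lambda y, x: x * y**2, 0.0, 2.0, lambda y: y**2, lambda y: 4.0)
average = integral / area
print(area, integral, average)
```

The area should be close to 16/3 and the average close to 16/7.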


Finally, though it is not a standard term, it is often helpful to describe functions 
satisfying the conditions in Exercise 12.1, so we make the following definition. 


Definition 96 


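Let $f : E \to \mathbb{R}$ be integrable on a Jordan region E in $\mathbb{R}^n$ so that for some
$j \in \{1, 2, \dots, n\}$, whenever $(x_1, \dots, x_j, \dots, x_n) \in E$ it is also true that $(x_1, \dots, -x_j, \dots, x_n) \in E$ and
$f(x_1, \dots, x_j, \dots, x_n) = -f(x_1, \dots, -x_j, \dots, x_n)$. Then we say that f is odd with
respect to $x_j$ over the region E, which is symmetric with respect to $x_j$.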


As shown in Exercise 12.1, the integral of a function f which is odd in a variable x; over 
a region which is symmetric with respect to x; is always zero. 


Three dimensional integrals are of particular interest to us in applications, largely since 
our universe seems to be three dimensional, at least in terms of the dimensions associated 
with space. Most of the development is already done for us by the theorems in the earlier 
sections (since they were proven for an arbitrary dimension). 

Cavalieri's principle is an example of a notion established by triple integrals.


Theorem 12.55. Cavalieri's Principle. Let $E_1, E_2$ be solids which are type 3 regions and
are between two parallel planes $z = h$, $z = k$. If the areas of the cross sections at each height
are the same then the volumes of the two solids are equal.

Proof. Since, by Fubini's Theorem, we can write this volume as $\lim_{n \to \infty} \sum_{i=1}^{n} \Delta z \iint f(x,y,z_i^*)\,dA$,
where $\iint f(x,y,z_i^*)\,dA$ is the area of the cross section at height $z_i^*$, which is the same for
both solids, the result follows.


As verified in earlier sections, iterated integrals with three integrations can be used to 
evaluate integrals over fully projectable (type 3) regions in $\mathbb{R}^3$. If we integrate the number
1 over a solid E then the integral is the volume of E, though typically it makes more sense
to do a double integral to determine a volume. However, many applications are based on 
integrating other functions over a solid. Any time you have something whose quantity per 
unit volume is known within a solid and you want to find the total amount of that thing 
over the solid you take a triple integral. For instance, if you know the mass per unit volume 
at each point within a solid then integrating that density function would give you the total 
mass of the solid. If you know the probability that an outcome will occur per unit volume 
within a solid then the integral over the solid will give the probability that the outcome 
will occur somewhere within the solid. If you know the total amount of heat energy per 
unit volume then the integral over the solid would give you the total heat energy within the 
solid. If you know the total charge per unit volume (or the force acting on a particle from 
this charge) then the integral would give the total charge (or total force on the particle) 
from the entire solid region. If you know the number of molecules of a chemical per cubic 
unit of volume within a solid region (perhaps a large portion of gas like the atmosphere 
over a city) then integrating would give you the total number of molecules of the chemical 
within the region. 


Setting up triple integrals: 
Iterated integrals with three integral signs follow a similar pattern to double integrals,
but graphing and interpreting the graph tends to be more difficult. It is usually good to
begin by looking at the graph of the solid E to be integrated over and looking for a direction
so that the solid, if projected onto the coordinate plane perpendicular to that direction,
projects to a type one or two region D on that coordinate plane. So, if we call that direction z
(it could be x or y, of course) then there are continuous functions $h_1(x,y) \le h_2(x,y)$ defined
on D so that E is the set of all points $(x,y,z)$ so that $(x,y) \in D$ and $h_1(x,y) \le z \le h_2(x,y)$.
We then look at D and set up bounds for the region as we did for a double integral over D,
selecting an outer variable to have numerical bounds and then letting the other variable's
integral bounds be functions of the first variable. Note that at the end of this process the
outer variable's bounds are numerical, the second integral's bounds are functions of the outer variable
and the third (inner) integral's bounds are functions of the preceding two variables. It may
help some people to think of the inner integral as the integrand function for a double
integral, so $\iiint_E f\,dV = \iint_D \left(\int_{h_1(x,y)}^{h_2(x,y)} f\,dz\right)dA$.


The process of integrating is the same idea as a double integral. Specifically, we treat 
all other variables to be constant except the variable we are integrating with respect to in 
each integral, plug in the bounds and then move to the next integral until all integrals are 
completed. This is illustrated in the following example: 


Example 12.6. Find $\int_0^1 \int_0^{x^2} \int_0^{xy} zx\,dz\,dy\,dx$.

Solution. We integrate with respect to z then y and then x as follows:

$\int_0^1 \int_0^{x^2} \int_0^{xy} zx\,dz\,dy\,dx = \int_0^1 \int_0^{x^2} \frac{z^2x}{2}\Big|_0^{xy}\,dy\,dx = \int_0^1 \int_0^{x^2} \frac{x^3y^2}{2}\,dy\,dx
= \int_0^1 \frac{x^3y^3}{6}\Big|_0^{x^2}\,dx = \int_0^1 \frac{x^9}{6}\,dx = \frac{x^{10}}{60}\Big|_0^1 = \frac{1}{60}$.

It is frequently possible to switch the order of the variables by changing the outer variable
or the projected region D. The integrand is unaffected by this process of setting up the
bounds. Here is an example where the bounds can be set up in six different ways because
the solid is fully projectable in any variable order. We will set up three of these integrals.
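
As a check on Example 12.6, the following sketch approximates the triple iterated integral with nested midpoint sums, reading the bounds as $0 \le z \le xy$, $0 \le y \le x^2$, $0 \le x \le 1$. The function name and the number of subdivisions are assumptions for illustration.

```python
def triple_midpoint(n=60):
    """Midpoint approximation of int_0^1 int_0^{x^2} int_0^{xy} z*x dz dy dx."""
    hx = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        hy = x**2 / n                  # middle bounds depend on x
        for j in range(n):
            y = (j + 0.5) * hy
            hz = x * y / n             # inner bounds depend on x and y
            inner = sum((k + 0.5) * hz * x for k in range(n)) * hz
            total += inner * hy * hx
    return total

approx = triple_midpoint()
print(approx, 1 / 60)
```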


Example 12.7. Let I = | f, where E is the region bounded by the surfaces z = 0, 
E 
y=9—27 andz=y. Set up bounds to evaluate this integral in three different ways. 


Solution. It is helpful to graph these solids before setting up the triple integral, so we include 
a graph below. We can see that if z= 0 and z = y then y = 0 at the intersection of these 
surfaces. We can also see that if y = 0 and y = 9 — ” then the intersection of these three 
surfaces occurs when x = +3. 

We will start by viewing the projection Dz, of vis solid onto the xy-plane as the region 
between the line y = 0 and the parabola y = 9 — x”. Then the outer venene could be z, 
where —3 < x < 3. The next variable could be y, where 0<y<9-2?, which gives the 


9—x? y 
region D,,. We could then have 0 < z < y over D. This gives us i: i, / fdzdydz. 
0 0 


-3 
One way to set the integral up with different bounds is just to use the same projection
$D_{xy}$ and reverse the order of integration on that region. This is a two dimensional region,
so it is a process we are already familiar with, and for each $0 \le y \le 9$ we would have
$-\sqrt{9-y} \le x \le \sqrt{9-y}$, so the integral would be
$\int_0^9 \int_{-\sqrt{9-y}}^{\sqrt{9-y}} \int_0^y f \, dz \, dx \, dy$.




For our third choice of bounds we will project $E$ onto the $yz$-plane to a region $D_{yz}$,
which is the region bounded by the triangle with sides $z = y$, $z = 0$ and $y = 9$. The largest
value of $y$ for this triangle is $y = 9$, and for every value of $y$ we would have $0 \le z \le y$.
The variable $x$ is restricted only by $y$ through the second surface, so
$-\sqrt{9-y} \le x \le \sqrt{9-y}$. We can thus express the integral as
$\int_0^9 \int_0^y \int_{-\sqrt{9-y}}^{\sqrt{9-y}} f \, dx \, dz \, dy$.


Graph of E 
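
As a consistency check on the bounds in Example 12.7 (this check is not part of the original solution), we can take the illustrative choice $f = 1$, so that $I$ is the volume of $E$. A sketch of the evaluation in the first two orders, which should agree:

```latex
\begin{align*}
\int_{-3}^{3}\int_0^{9-x^2}\int_0^y 1\,dz\,dy\,dx
  &= \int_{-3}^{3}\frac{(9-x^2)^2}{2}\,dx
   = \int_0^3 \left(81 - 18x^2 + x^4\right)dx
   = 243 - 162 + \frac{243}{5} = \frac{648}{5},\\
\int_0^9\int_{-\sqrt{9-y}}^{\sqrt{9-y}}\int_0^y 1\,dz\,dx\,dy
  &= \int_0^9 2y\sqrt{9-y}\,dy
   = \int_0^9 2(9-s)\sqrt{s}\,ds
   = 2\left(162 - \frac{486}{5}\right) = \frac{648}{5}.
\end{align*}
```

The third order reduces to the same single integral $\int_0^9 2y\sqrt{9-y}\,dy$ after the inner two integrations, so it gives the same value.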


Transformation of Variables 


It is frequently useful to take a particular coordinate system and change it to another, 
such as changing from rectangular to polar coordinates. We would like to be able to make 
such substitutions in a manner which is easy to manage. The following theorem is somewhat 
tricky to prove, and the arguments we give may seem a bit cumbersome. There are several 
slightly different forms of this result. For now, we are just stating the theorem. 


Theorem. Transformation of Variables. Let $\phi : U \to \mathbb{R}^n$ be a one to one $C^1$ function,
where $U$ is open in $\mathbb{R}^n$, $\det D\phi(x) \neq 0$ on $U$ and $E$ is a Jordan region whose closure is
contained in $U$. Then $\int_{\phi(E)} f = \int_E (f \circ \phi)\,|\det D\phi|$.


This theorem is helpful because it allows us to justify many conversions of variables 
formally to coordinate systems like polar coordinates, spherical coordinates and cylindrical 




coordinates (the latter two are discussed in the next section). This is much more general, 
however, and allows us to transform more complicated regions into simpler ones if we can 
come up with the right transformations to make this work. 

We first discuss why this theorem is reasonable. We will discuss this as a sequence of 
observations. 

First, observe that due to the Inverse Function Theorem, a point $x$ in the interior of $E$
will be contained in an open set contained in $E$ whose image contains an open ball which
contains $\phi(x)$. Since $\phi$ is one to one it must follow that the image of any open ball about
a point on the boundary of $E$ contains a point in $\phi(E)$ and a point not in $\phi(E)$, which means
that the boundary of $\phi(E)$ is $\phi(\partial(E))$.

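Second, focusing on the two dimensional case for now, let $\phi(u,v) = (x(u,v), y(u,v))$ in
$\mathbb{R}^2$. If we were to take the edges of a rectangle
$R_{ij} = [u_i, u_i + \Delta u] \times [v_j, v_j + \Delta v] \subset E$ with vector sides
$\langle \Delta u, 0 \rangle$ and $\langle 0, \Delta v \rangle$ then if the derivative of the transformation $\phi$
were a constant matrix
$D = \begin{pmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \end{pmatrix}$
on the rectangle, we would have that
$$\phi(u_i + h_1, v_j + h_2) - \phi(u_i, v_j) = \left( \frac{\partial x}{\partial u} h_1 + \frac{\partial x}{\partial v} h_2, \; \frac{\partial y}{\partial u} h_1 + \frac{\partial y}{\partial v} h_2 \right)$$
by Taylor's Theorem (or the Mean Value Theorem for Real Valued Functions for that matter) on
each component function. If we only look at points on the rectangle then $0 \le h_1 \le \Delta u$ and
$0 \le h_2 \le \Delta v$. This means that the image of the rectangle $R_{ij}$ (which is, by definition,
$\{(u_i + t_1 \Delta u, v_j + t_2 \Delta v) \mid 0 \le t_1 \le 1 \text{ and } 0 \le t_2 \le 1\}$) under $\phi$ would be the
parallelogram
$$P_{ij} = \left\{ \left( x(u_i, v_j) + \frac{\partial x}{\partial u} \Delta u \, t_1 + \frac{\partial x}{\partial v} \Delta v \, t_2, \; y(u_i, v_j) + \frac{\partial y}{\partial u} \Delta u \, t_1 + \frac{\partial y}{\partial v} \Delta v \, t_2 \right) \;\middle|\; 0 \le t_1 \le 1 \text{ and } 0 \le t_2 \le 1 \right\}.$$
In other words, the image of the rectangle would be a parallelogram. A similar argument would
have shown the image of a three dimensional rectangle would have been a three dimensional
parallelepiped. Note that the vector sides of the new parallelogram would be
$\left( \frac{\partial x}{\partial u} \Delta u, \frac{\partial y}{\partial u} \Delta u \right)$ and
$\left( \frac{\partial x}{\partial v} \Delta v, \frac{\partial y}{\partial v} \Delta v \right)$.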


Third, we notice that if the derivative is not constant, but is very close to constant
(each partial is between two very close values) on the rectangle $R_{ij}$ then $\phi(R_{ij})$ is contained
between two parallelograms $P_{ij} \subset \phi(R_{ij}) \subset Q_{ij}$ that are very close to one another. In fact,
they are so close that the difference of their areas, taken as a ratio to the area of $P_{ij}$, can
be made as small as we wish, as long as the derivatives are bounded between sufficiently
close values. In other words, if we were to take the derivative at any point $p_{ij} \in R_{ij}$ and
use this to find a parallelogram $M_{ij}$ that would have been the image of $R_{ij}$ had the matrix
$D\phi(p_{ij})$ been equal to $D\phi(x)$ for all $x \in R_{ij}$ then we can make
$\frac{\mathrm{Vol}(M_{ij})}{\mathrm{Vol}(\phi(R_{ij}))}$ as close to
one as we wish. We can also get a similar result for parallelepipeds. These are both
based on the idea of establishing that if derivatives are near constant then the images of line
segments are near the starting points plus the displacement vectors acted on by the linear
transformations of those segments (the images had the derivative been the constant the
derivative is near), in that the lengths of displacements between vectors are similar (their
ratio is near one) and the angles between the displacement vectors are small. This can let
us show that the image fits within the desired parallelograms.

Fourth, recall that we discussed that the area of a parallelogram in $\mathbb{R}^2$ with vector
edges $u, v$ is $|\det(u, v)|$ and that the volume of a parallelepiped with vector edges $u, v, w$ is
$|\det(u, v, w)|$, which was demonstrated in Theorem 10.4 under some minimal assumptions of
geometry. This means that in the previous step the volume
$\mathrm{Vol}(M_{ij}) = \Delta u \Delta v \, |\det D\phi(p_{ij})|$.
We have a more careful form of this observation for $n$ dimensions below (not relying on
geometric concepts of volume that we haven't developed with proper rigor).
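
The second and fourth observations can be checked numerically in a small case. The following is a minimal sketch, assuming a hypothetical constant derivative matrix `D` and hypothetical side lengths `du`, `dv` (none of these values come from the text): it maps the corners of a rectangle by the linear map and compares the shoelace area of the image with $\Delta u \Delta v \, |\det D|$.

```python
def shoelace_area(points):
    """Area of a simple polygon from its vertices listed in order."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def image_of_rectangle(matrix, corner, du, dv):
    """Corners of the image of [u, u+du] x [v, v+dv] under the linear map p -> matrix * p."""
    (a, b), (c, d) = matrix
    u, v = corner
    rect = [(u, v), (u + du, v), (u + du, v + dv), (u, v + dv)]
    return [(a * p + b * q, c * p + d * q) for p, q in rect]


D = ((2.0, 1.0), (0.5, 3.0))   # hypothetical constant derivative matrix
du, dv = 0.1, 0.2              # hypothetical side lengths of the rectangle
det_D = D[0][0] * D[1][1] - D[0][1] * D[1][0]

area = shoelace_area(image_of_rectangle(D, (1.0, -1.0), du, dv))
predicted = abs(det_D) * du * dv
print(area, predicted)
```

The two printed values should agree up to rounding, which is the two dimensional case of $\mathrm{Vol}(M_{ij}) = \Delta u \Delta v \, |\det D\phi(p_{ij})|$ for a constant derivative.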

Fifth, we can show that if the base rectangle were subdivided into a union of non- 
overlapping Jordan regions (instead of a grid consisting of subrectangles) then multiplying 
the volumes of the Jordan regions times the function values would give a sum that can 
be made as close as we wish to the integral if the diameters of those Jordan regions are 
sufficiently small. 

Finally, we can take a grid so that the rectangles on the grid which intersect the
region $E$ and their corresponding images under $\phi$ have very small diameter. Since the
partial derivatives are all continuous on the closed bounded rectangle they are uniformly
continuous, so we can subdivide the grid into small enough rectangles so that the derivatives
are close enough to constant on each subrectangle that the ratios
$\frac{\mathrm{Vol}(M_{ij})}{\mathrm{Vol}(\phi(R_{ij}))}$ are
close to one on every rectangle, meaning that
$\mathrm{Vol}(\phi(R_{ij})) = k_{ij} \Delta u \Delta v \, |\det D\phi(p_{ij})|$ where $k_{ij}$
is close to one for all $i, j$. Thus, if we let $S$ be the set of rectangles in the interior of $E$
in this subdivision with fine mesh then
$\sum_{R_{ij} \in S} f(\phi(p_{ij})) \, |R_{ij}| \, |\det D\phi(p_{ij})|$ is very close to
$\sum_{\{\phi(R_{ij}) \mid R_{ij} \in S\}} f(q_{ij}) \, \mathrm{Vol}(\phi(R_{ij}))$, where the $p_{ij}$ points are chosen as the
preimages of the $q_{ij}$ (so $f(q_{ij})$ means the same thing as $f(\phi(p_{ij}))$). Each of these sums can be
made arbitrarily close to the two integrals in the theorem listed above, which means that the
integrals are equal.

This argument makes the result above plausible, but it is just an intuitive description,
and ideas like making volumes close over sums when derivatives are close take some effort
to formalize.


We can use the change of variables formula to get formulas for integrals with polar
coordinates. The maps $x(r,\theta) = r\cos(\theta)$ and $y(r,\theta) = r\sin(\theta)$ are $C^1$ maps on
$[0, R] \times [0, 2\pi]$ and these maps are one to one except on a set of volume zero. Hence, using the
change of variables formula, if $R$ is a region in the $xy$-plane which is the image of $D$ in the
$r\theta$ plane, we have
$\iint_R f(x,y) \, dA = \iint_D f(r\cos\theta, r\sin\theta) \, |\det J| \, dA$, where
$J = \begin{pmatrix} \cos(\theta) & -r\sin(\theta) \\ \sin(\theta) & r\cos(\theta) \end{pmatrix}$,
so $\det J = r\cos^2(\theta) + r\sin^2(\theta) = r$. From this it is possible to more formally derive the
formula for polar area in one variable. If we want to find the area enclosed by $r = r(\theta)$
and the origin over $\theta_1 \le \theta \le \theta_2$, we can call this region $R$ and then the area $A$ of $R$ is
found by integrating 1 over $R$, which means that
$$A = \iint_R 1 \, dA = \int_{\theta_1}^{\theta_2} \int_0^{r(\theta)} 1(r) \, dr \, d\theta = \int_{\theta_1}^{\theta_2} \frac{r^2}{2} \Big|_0^{r(\theta)} d\theta = \frac{1}{2} \int_{\theta_1}^{\theta_2} (r(\theta))^2 \, d\theta,$$
which is the formula for polar area we used in chapter
six. Since this development is more formal and less pictorial (with a clear definition for
what area means rather than relying as heavily on properties of geometry that we have not
derived), this is a more rigorous derivation of this formula.
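
The polar area formula $A = \frac{1}{2}\int_{\theta_1}^{\theta_2} (r(\theta))^2 \, d\theta$ can be sanity checked numerically. The sketch below uses a midpoint rule and, as an illustrative assumption not taken from the text, the cardioid $r(\theta) = 1 + \cos(\theta)$, whose enclosed area is known to be $\frac{3\pi}{2}$. The function names are hypothetical.

```python
import math


def polar_area(r, theta1, theta2, n=2000):
    """Midpoint-rule approximation of (1/2) * integral of r(theta)^2 over [theta1, theta2]."""
    h = (theta2 - theta1) / n
    total = 0.0
    for i in range(n):
        theta = theta1 + (i + 0.5) * h
        total += r(theta) ** 2
    return 0.5 * total * h


def cardioid(theta):
    """Sample polar curve r = 1 + cos(theta)."""
    return 1.0 + math.cos(theta)


area = polar_area(cardioid, 0.0, 2.0 * math.pi)
exact = 1.5 * math.pi
print(area, exact)
```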


Example 12.8. Find $\iint_E x^2 \, dA$, where $E$ is the region bounded by the ellipse
$\frac{x^2}{4} + \frac{y^2}{9} = 1$.


Solution. If we set $x = 2u$ and $y = 3v$ then we would have
$\frac{4u^2}{4} + \frac{9v^2}{9} = u^2 + v^2 = 1$. Thus,
the transformation described takes the unit disk $D$ to the elliptic disk $E$. The Jacobian of
the transformation is $\det \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix} = 6$. Thus, the integral becomes
$\iint_D 6(4u^2) \, dA$. We can then change the integral to polar coordinates so that it becomes
$$24 \int_0^{2\pi} \int_0^1 r^3 \cos^2(\theta) \, dr \, d\theta = 24 \int_0^{2\pi} \cos^2(\theta) \, d\theta \int_0^1 r^3 \, dr = 24 (\pi) \left( \frac{1}{4} \right) = 6\pi.$$


Example 12.9. Find $\iint_E 4x^2 + xy \, dA$, where $E$ is the parallelogram bounded by the lines
$4x + y = 0$, $4x + y = 2$, $x - y = 1$ and $x - y = 4$.


Solution. We will set $u = 4x + y$ (which ranges from 0 to 2 on the parallelogram) and
$v = x - y$ (which ranges from 1 to 4 on the parallelogram). We solve for $x$ and $y$, beginning
with adding these equations to get $5x = u + v$, so $x = \frac{u+v}{5}$. Subtracting four times the
second equation from the first would give $5y = u - 4v$, so $y = \frac{u - 4v}{5}$. This lets us get the
Jacobian $\det \begin{pmatrix} \frac{1}{5} & \frac{1}{5} \\ \frac{1}{5} & -\frac{4}{5} \end{pmatrix} = -\frac{1}{5}$,
which has absolute value $\frac{1}{5}$. Note that the integrand is
$(4x + y)(x) = u \left( \frac{u+v}{5} \right)$. Thus, the integral becomes
$$\frac{1}{25} \int_0^2 \int_1^4 u^2 + uv \, dv \, du = \frac{1}{25} \int_0^2 \left( u^2 v + \frac{uv^2}{2} \right) \Big|_1^4 du = \frac{1}{25} \int_0^2 \left( 4u^2 + 8u - u^2 - \frac{u}{2} \right) du$$
$$= \frac{1}{25} \int_0^2 3u^2 + \frac{15u}{2} \, du = \frac{1}{25} \left( u^3 + \frac{15u^2}{4} \right) \Big|_0^2 = \frac{23}{25}.$$
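
The substitution in Example 12.9 can be checked numerically. The following is a rough sketch, not the text's method: it applies a midpoint rule on the $uv$-rectangle $[0,2] \times [1,4]$, maps each sample point back to $(x, y)$ using $x = \frac{u+v}{5}$, $y = \frac{u-4v}{5}$, and multiplies by $|\det J| = \frac{1}{5}$. The function name and the grid size `n` are hypothetical choices.

```python
def integrate_over_parallelogram(f, n=400):
    """Midpoint rule in (u, v) for f(x, y) on the parallelogram of Example 12.9.

    Uses u = 4x + y in [0, 2] and v = x - y in [1, 4], so that
    x = (u + v)/5, y = (u - 4v)/5 and |det J| = 1/5.
    """
    hu = 2.0 / n
    hv = 3.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * hu
        for j in range(n):
            v = 1.0 + (j + 0.5) * hv
            x = (u + v) / 5.0
            y = (u - 4.0 * v) / 5.0
            total += f(x, y)
    return total * hu * hv / 5.0


approx = integrate_over_parallelogram(lambda x, y: 4.0 * x * x + x * y)
print(approx, 23 / 25)
```

The approximation should be close to $\frac{23}{25} = 0.92$, with the small discrepancy coming from the midpoint rule.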


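Example 12.10. Find $\iiint_E 2xz \, dV$, where $E$ is the parallelepiped bounded by the planes
$x + y + z = 0$, $x + y + z = 2$, $x - y - z = 0$ and $x - y - z = 1$, $x - y + z = 0$ and $x - y + z = 4$.


Solution. We will set $u = x + y + z$ (which ranges from 0 to 2 on the parallelepiped) and
$v = x - y - z$ (which ranges from 0 to 1 on the parallelepiped) and $w = x - y + z$
(which ranges from 0 to 4 on the parallelepiped). We then solve for the variables.
Adding the first two equations gives $u + v = 2x$, so $x = \frac{u+v}{2}$. Subtracting the third
equation from the second gives $v - w = -2z$, so $z = \frac{w-v}{2}$. Subtracting the third
equation from the first gives $u - w = 2y$, so $y = \frac{u-w}{2}$. The Jacobian determinant is
then
$$\det \begin{pmatrix} \frac{1}{2} & \frac{1}{2} & 0 \\ \frac{1}{2} & 0 & -\frac{1}{2} \\ 0 & -\frac{1}{2} & \frac{1}{2} \end{pmatrix} = -\frac{1}{4},$$
the absolute value of which is $\frac{1}{4}$. Thus,
$$\iiint_E 2xz \, dV = \frac{1}{8} \int_0^2 \int_0^1 \int_0^4 uw - uv + vw - v^2 \, dw \, dv \, du$$
$$= \frac{1}{8} \left( \int_0^2 u \, du \int_0^1 dv \int_0^4 w \, dw + \int_0^2 du \int_0^1 v \, dv \int_0^4 w \, dw - \int_0^2 u \, du \int_0^1 v \, dv \int_0^4 1 \, dw - \int_0^2 du \int_0^1 v^2 \, dv \int_0^4 1 \, dw \right)$$
$$= \frac{1}{8} \left( (2)(1)(8) + (2)\left(\tfrac{1}{2}\right)(8) - (2)\left(\tfrac{1}{2}\right)(4) - (2)\left(\tfrac{1}{3}\right)(4) \right) = \frac{13}{6}.$$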
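
The arithmetic in Example 12.10 can be checked exactly with rational numbers. The sketch below is an illustration rather than part of the original solution: it computes the Jacobian determinant by cofactor expansion and integrates each monomial of $2xz = \frac{1}{2}(uw - uv + vw - v^2)$ over the box $[0,2] \times [0,1] \times [0,4]$ as a product of one variable integrals. The helper names are hypothetical.

```python
from fractions import Fraction as Fr


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def power_integral(k, upper):
    """Exact integral of t**k over [0, upper]."""
    return Fr(upper) ** (k + 1) / (k + 1)


half = Fr(1, 2)
# Rows are the partials of x, y, z with respect to (u, v, w).
jacobian = [[half, half, 0], [half, 0, -half], [0, -half, half]]
det_j = det3(jacobian)

# Monomials u^a v^b w^c of uw - uv + vw - v^2, stored as (coefficient, (a, b, c)).
terms = [(1, (1, 0, 1)), (-1, (1, 1, 0)), (1, (0, 1, 1)), (-1, (0, 2, 0))]
upper_limits = (2, 1, 4)  # upper limits for u, v, w

total = Fr(0)
for coeff, powers in terms:
    product = Fr(coeff)
    for k, upper in zip(powers, upper_limits):
        product *= power_integral(k, upper)
    total += product

integral = abs(det_j) * total / 2  # the factor 1/2 comes from 2xz = (u+v)(w-v)/2
print(det_j, integral)
```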


Theorem 12.56. Let $\phi : U \to \mathbb{R}^n$ be $C^1$, where $U$ is an open set in $\mathbb{R}^n$ containing a
compact set $K$, and $\Delta_\phi \neq 0$ on $U$. Then:

(a) There is an $M > 0$ so that if $L(x, y) \subset K$ then $|\phi(x) - \phi(y)| \le M|x - y|$.

(b) Let $\mathrm{Vol}(E) = 0$ where $\overline{E} \subset U$. Then $\mathrm{Vol}(\phi(E)) = 0$. Furthermore, there is a $\delta > 0$
and a constant $C > 0$ so that if a cube $W$ has diameter less than $\delta$ and intersects $E$ then
$\phi(W)$ is contained in a cube of volume $C\,\mathrm{Vol}(W)$.


Proof. (a) By the Mean Value Theorem for Vector Valued Functions, if $L(x,y) \subset K \subset U$
then there is some $c \in L(x, y)$ so that $(\phi(x) - \phi(y)) \cdot D\phi(c)(x - y) = |\phi(x) - \phi(y)|^2$.

By the Cauchy-Schwarz Inequality and Theorem 10.32,
$|\phi(x) - \phi(y)|^2 \le |\phi(x) - \phi(y)| \, |D\phi(c)| \, |x - y|$, so $|\phi(x) - \phi(y)| \le |D\phi(c)| \, |x - y|$. Also by
Theorem 10.32, for each $x \in U$ we know
$|D\phi(x)| \le \sqrt{n} \sum_{i=1}^n \sum_{j=1}^n \left| \frac{\partial \phi_i}{\partial x_j}(x) \right|$, which is continuous since $\phi$ is $C^1$. By the Extreme
Value Theorem, it follows that $\sqrt{n} \sum_{i=1}^n \sum_{j=1}^n \left| \frac{\partial \phi_i}{\partial x_j}(x) \right|$ takes on a maximum value $M$ on $K$ and
therefore $|\phi(x) - \phi(y)| \le M|x - y|$.

(b) For each $x \in \overline{E}$ we can find $r_x > 0$ so that $C_{r_x}(x) \subset U$. Since $\overline{E}$ is compact and
$\{C_{r_x}(x)^\circ\}_{x \in \overline{E}}$ is an open cover of $\overline{E}$, we can find a finite subcover
$F = \{C_{r_{x_i}}(x_i)^\circ\}_{1 \le i \le k}$, and
we note that $K = \bigcup_{i=1}^k C_{r_{x_i}}(x_i)$ is compact and $\overline{E} \subset K \subset U$. By part (a) we can find $M > 0$
so that $|\phi(x) - \phi(y)| \le M|x - y|$ if $L(x, y) \subset K$.

Let $\varepsilon > 0$. By the Lebesgue Number Lemma we can find $\delta > 0$ so that if a set $S$
intersects $\overline{E}$ and has diameter less or equal to $\delta$ then $S$ is a subset of an element of $F$. By
Theorem 12.27, we can find $0 < \eta < \frac{\delta}{\sqrt{n}}$ and a collection of cubes $Q = \{Q_i\}_{1 \le i \le t}$ which
covers $\overline{E}$, each of which has side length $\eta$ and therefore diameter $\eta\sqrt{n} < \delta$ by Theorem 12.1,
so that $\sum_{i=1}^t |Q_i| < \frac{\varepsilon}{4^n M^n n^{n/2}}$.

If $W$ is any cube with diameter less than $\delta$ of side length $L$ so that $W$ intersects $E$, since
$W \subset C_{r_{x_i}}(x_i)$ for some $i \in \{1, 2, 3, \ldots, k\}$, we know that $W$ is a convex subset of $K$,
so for any $x, y \in W$ it follows that $|\phi(x) - \phi(y)| \le M\sqrt{n}L$. Hence, for any point
$z \in \phi(W)$, we know that $B_{2M\sqrt{n}L}(z)$ contains $\phi(W)$ and hence by Theorem 12.1, it follows
that $\phi(W) \subset C_{2M\sqrt{n}L}(z)$. The volume of $W$ is $L^n$ and the volume of $C_{2M\sqrt{n}L}(z)$ is
$(4M\sqrt{n})^n L^n = C\,\mathrm{Vol}(W)$, where $C = (4M\sqrt{n})^n$.

For each $1 \le i \le t$, $\mathrm{diam}(Q_i) < \delta$. For each $1 \le i \le t$ we can choose a point
$z_i \in \phi(Q_i)$. It follows that $\phi(Q_i) \subset C_{2M\eta\sqrt{n}}(z_i)$, so $\phi(E) \subset \bigcup_{i=1}^t C_{2M\eta\sqrt{n}}(z_i)$.
We know that $|Q_i| = \eta^n$, and $|C_{2M\eta\sqrt{n}}(z_i)| = 4^n M^n n^{n/2} \eta^n = 4^n M^n n^{n/2} |Q_i|$. Thus,
$$\sum_{i=1}^t |C_{2M\eta\sqrt{n}}(z_i)| = 4^n M^n n^{n/2} \sum_{i=1}^t |Q_i| < 4^n M^n n^{n/2} \cdot \frac{\varepsilon}{4^n M^n n^{n/2}} = \varepsilon.$$
Hence, $\mathrm{Vol}(\phi(E)) = 0$.
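
Part (a) of Theorem 12.56 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it uses the polar coordinate map as a sample $C^1$ function, the compact convex set $K = [1,2] \times [0, \frac{\pi}{2}]$, and a constant `M` obtained by bounding each partial derivative separately on $K$ (which is at least as large as the maximum used in the proof). It then samples random pairs of points and records the largest ratio $\frac{|\phi(x) - \phi(y)|}{|x - y|}$ observed.

```python
import math
import random


def phi(r, theta):
    """Polar coordinate map, used here only as a sample C^1 function."""
    return (r * math.cos(theta), r * math.sin(theta))


# On K = [1, 2] x [0, pi/2] the partials cos, -r sin, sin, r cos are bounded
# in absolute value by 1, 2, 1, 2, so with n = 2 this M bounds sqrt(n) times
# the sum of the absolute values of the partials everywhere on K.
M = math.sqrt(2.0) * (1.0 + 2.0 + 1.0 + 2.0)

random.seed(0)  # fixed seed so the sample is reproducible
worst_ratio = 0.0
for _ in range(10000):
    p = (random.uniform(1.0, 2.0), random.uniform(0.0, math.pi / 2))
    q = (random.uniform(1.0, 2.0), random.uniform(0.0, math.pi / 2))
    distance = math.dist(p, q)
    if distance > 1e-9:
        ratio = math.dist(phi(*p), phi(*q)) / distance
        worst_ratio = max(worst_ratio, ratio)

print(worst_ratio, M)
```

Since $K$ is convex, every segment $L(x, y)$ with endpoints in $K$ lies in $K$, so the observed ratios should never exceed `M`.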


First we note some properties of nice maps which are one to one, continuously differentiable 
and have non-zero derivative determinants. 


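Theorem 12.57. Let $\phi : U \to \mathbb{R}^n$ be one to one and $C^1$, with $\Delta_\phi \neq 0$ on $U$, where $U$ is
an open set in $\mathbb{R}^n$ containing $\overline{E}$ and $E \subset \mathbb{R}^n$. Then $\phi(\partial(E)) = \partial(\phi(E))$.
Furthermore, $\phi(\overline{E}) = \overline{\phi(E)}$.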


Proof. By Theorem 11.24, $\phi$ is a homeomorphism.

Let $p \in \partial(E)$. Let $W \subset U$ be an open set containing $p$. Then $W$ contains a point $x \in E$
and a point $y \notin E$, so $\phi(W)$ contains a point $\phi(x) \in \phi(E)$ and a point $\phi(y) \notin \phi(E)$. Since
every open set $V$ containing $\phi(p)$ contains the image of an open set $\phi^{-1}(V)$ containing $p$, it
follows that every open set containing $\phi(p)$ contains a point of $\phi(E)$ and a point not in $\phi(E)$, so
$\phi(p) \in \partial(\phi(E))$.

Since $\phi^{-1}$ is also continuous on the open set $\phi(U)$, it follows from the same argument
that if $q \in \partial(\phi(E))$ then $\phi^{-1}(q) \in \partial(\phi^{-1}(\phi(E))) = \partial(E)$. Hence, $\phi(\partial(E)) = \partial(\phi(E))$.

From this we see that $\phi(\overline{E}) = \phi(E) \cup \phi(\partial(E)) = \phi(E) \cup \partial(\phi(E)) = \overline{\phi(E)}$.


Theorem 12.58. (a) Let $\phi : U \to \mathbb{R}^n$ be $C^1$, where $U$ is an open set in $\mathbb{R}^n$ containing the
closure of Jordan region $E$, and let $\Delta_\phi \neq 0$ on $U$. Then $\phi(E)$ is a Jordan region.

(b) If $\phi$ is also one to one on $U$ except on a set $S$ of volume zero, and $E$ and $D$ are
non-overlapping Jordan regions whose closures are contained in $U$, then $\phi(E)$ and $\phi(D)$ are
non-overlapping Jordan regions.


Proof. (a) By the Inverse Function Theorem we know that $\phi(V)$ is open for every open
$V \subset U$. In particular, $\phi(E^\circ)$ is an open subset of $\phi(E)$, and is therefore contained in
the interior of $\phi(E)$. Since $E$ is a Jordan region we also know that $E$ is bounded, so
$\overline{E}$ is compact. Thus, $\phi(\overline{E})$ is compact since the continuous image of a compact set is
compact, and is closed by the Heine-Borel Theorem, and contains $\phi(E)$. Since $\overline{\phi(E)}$ is
the intersection of all closed sets containing $\phi(E)$ we know that $\overline{\phi(E)} \subset \phi(\overline{E})$. Thus,
$\partial(\phi(E)) = \overline{\phi(E)} \setminus \phi(E)^\circ \subset \phi(\overline{E}) \setminus \phi(E^\circ) \subset \phi(\partial(E))$. Since $E$ is a Jordan region, we know
that $\mathrm{Vol}(\partial(E)) = 0$, so by Theorem 12.56, it follows that $\mathrm{Vol}(\phi(\partial(E))) = 0$ and thus
$\mathrm{Vol}(\partial(\phi(E))) = 0$, which means that $\phi(E)$ is a Jordan region.

(b) Since $E$ and $D$ are non-overlapping we know that $\mathrm{Vol}(E \cap D) = 0$, which means that
$\mathrm{Vol}(\phi(E \cap D)) = 0$. Likewise, we know that $\mathrm{Vol}(\phi(S)) = 0$. Let $y \in (\phi(E) \cap \phi(D)) \setminus \phi(S)$.
Then there is a unique point $x \in U$ so that $\phi(x) = y$, which means that $x \in E \cap D$. It follows
that $(\phi(E) \cap \phi(D)) \setminus \phi(S) \subset \phi(E \cap D)$ and thus $(\phi(E) \cap \phi(D)) \subset (\phi(S) \cup \phi(E \cap D))$. Since
the union of two sets of volume zero has volume zero, we know that $\mathrm{Vol}(\phi(E) \cap \phi(D)) = 0$
and therefore $\phi(E)$ and $\phi(D)$ are non-overlapping.


Theorem 12.59. (a) Let $R$ be a rectangle in $\mathbb{R}^n$ and let $\varepsilon > 0$. Then there is a collection $Q$
of cubes whose interiors cover $R$ so that $\mathrm{Vol}(\bigcup Q \setminus R) < \varepsilon$. Likewise, there is a collection
$W$ of non-overlapping cubes contained in $R$ so that $\mathrm{Vol}(R \setminus \bigcup W) < \varepsilon$.

(b) A set $S$ in $\mathbb{R}^n$ has Lebesgue measure zero if and only if for every $\varepsilon > 0$ there is a
collection of cubes $\{C_i\}_{i \in \mathbb{N}}$ whose interiors cover $S$ so that $\sum_{i=1}^\infty |C_i| < \varepsilon$.

(c) Let $\phi : U \to \mathbb{R}^n$ be one to one, $C^1$ and have $\Delta_\phi \neq 0$ on $U$. Let $E$ be a Jordan
region so that $\overline{E} \subset U$ and $\lambda(E) = 0$. Then $\lambda(\phi(E)) = 0$.


Proof. (a) First, let $K$ be a cube containing $R$. By Theorem 12.29 we know that there
is a $\delta > 0$ so that if a grid $G$ on $K$ has mesh less than $\delta$ then for the function $f(x) = 1$
on $R$ we have $U(f,G) - L(f,G) = V(R,G) - v(R,G) < \varepsilon$. Let $G = \{R_i\}_{1 \le i \le k}$ be a
grid on $K$ consisting of cubes (obtained by dividing each edge factor of $K$ into the same
number of evenly spaced subdivisions) with $|G| < \delta$. Then we know that
$\mathrm{Vol}(R) - \varepsilon < v(R,G) \le \mathrm{Vol}(R) \le V(R,G) < \mathrm{Vol}(R) + \varepsilon$. By Theorem 12.7 we can replace each $R_i \in G$
be a cube Q; so that R; C Q? and Vol(Q;) — Vol(Ri) < = WES) = VOK)) 
k k 
Ss \Qi| < Vol(R) +e and RC U Q>. 
By Theorem 12.7. re 
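(b) If there is such a collection of cubes for each $\varepsilon > 0$ then by definition we know
that $\lambda(S) = 0$. Assume that $\lambda(S) = 0$ and let $\varepsilon > 0$. Then we can pick a collection of
rectangles $\{R_i\}_{i \in \mathbb{N}}$ covering $S$ so that $\sum_{i=1}^\infty |R_i| < \frac{\varepsilon}{2}$. By (a), for each $i$ we can pick a
collection $C_i = \{W_{i,j}\}_{1 \le j \le n_i}$ of cubes whose interiors cover $R_i$ so that
$\mathrm{Vol}(\bigcup C_i \setminus R_i) < \frac{\varepsilon}{2^{i+1}}$. Hence,
$$\sum_{i=1}^\infty \sum_{j=1}^{n_i} |W_{i,j}| < \sum_{i=1}^\infty |R_i| + \sum_{i=1}^\infty \frac{\varepsilon}{2^{i+1}} < \varepsilon.$$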


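(c) By Theorem 12.56 we know that there is a $\delta > 0$ and a $C > 0$ so that if $W$ is a cube
of diameter less than $\delta$ and $W \cap E \neq \emptyset$ then $\phi(W)$ is contained in a cube of volume no
larger than $C\,\mathrm{Vol}(W)$.

Let $\lambda(E) = 0$ and let $\varepsilon > 0$. Then we can find a collection of cubes $\{Q_i\}_{i \in \mathbb{N}}$ which covers $E$ so
that the diameter of each $Q_i$ is less than $\delta$ and $\sum_{i=1}^\infty |Q_i| < \frac{\varepsilon}{C}$. For each $Q_i$ we choose a
cube $R_i$ containing $\phi(Q_i)$ so that $\mathrm{Vol}(R_i) \le C\,\mathrm{Vol}(Q_i)$. Then $\{R_i\}_{i \in \mathbb{N}}$ covers $\phi(E)$ and
$\sum_{i=1}^\infty |R_i| < C \cdot \frac{\varepsilon}{C} = \varepsilon$. Hence, $\lambda(\phi(E)) = 0$.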




Theorem 12.60. Let $R = \prod_{i=1}^n [a_i, b_i]$ be an $n$-rectangle. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be the linear
transformation defined by $T(e_j) = e_k$, $T(e_k) = e_j$ for some $j, k \in \{1, 2, 3, \ldots, n\}$, and
$T(e_i) = e_i$ for all $i \in \{1, 2, 3, \ldots, n\} \setminus \{j, k\}$. Then $\mathrm{Vol}(T(R)) = \mathrm{Vol}(R)$ and $T(R)$ is a
rectangle.


Proof. A point $(x_1, x_2, \ldots, x_n) \in R$ if and only if
$(x_1, x_2, \ldots, x_{j-1}, x_k, x_{j+1}, \ldots, x_{k-1}, x_j, x_{k+1}, \ldots, x_n) \in T(R)$. Hence,
$$T(R) = [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_{j-1}, b_{j-1}] \times [a_k, b_k] \times [a_{j+1}, b_{j+1}] \times \cdots \times [a_{k-1}, b_{k-1}] \times [a_j, b_j] \times \cdots \times [a_n, b_n].$$
Thus, $\mathrm{Vol}(T(R)) = \prod_{i=1}^n (b_i - a_i) = \mathrm{Vol}(R)$ since switching the order of multiplication does not
affect a product.


Next, we observe that by using properties of determinants and elementary matrices
we can show that the image of a rectangle under a linear transformation represented by
multiplication by a matrix has a volume equal to its original volume times the absolute value
of the determinant of the matrix (we also observe that translations don't change volumes since
this will be useful later). This is done in a few steps.


Theorem 12.61. Let $R = \prod_{i=1}^n [a_i, b_i]$ be an $n$-rectangle. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be the linear
transformation defined by $T(e_j) = ke_j$ for some non-zero constant $k$, and $T(e_i) = e_i$ for
$i \neq j$. Then $\mathrm{Vol}(T(R)) = |k|\mathrm{Vol}(R)$ and $T(R)$ is a rectangle.


Proof. First, a point $(x_1, x_2, \ldots, x_n) \in R$ if and only if
$(x_1, x_2, \ldots, x_{j-1}, kx_j, x_{j+1}, \ldots, x_n) \in T(R)$.
Hence, $T(R)$ is a rectangle whose edge factors are $[a_i, b_i]$ in all coordinates except for the
$j$th coordinate, and whose $j$th coordinate is between $ka_j$ and $kb_j$. If $k > 0$ then the $j$th
edge factor is $[ka_j, kb_j]$ so the volume of $T(R)$ is $k \prod_{i=1}^n (b_i - a_i) = k\mathrm{Vol}(R) = |k|\mathrm{Vol}(R)$. If
$k < 0$ then the $j$th edge factor is $[kb_j, ka_j]$, so the volume is
$\mathrm{Vol}(T(R)) = -k \prod_{i=1}^n (b_i - a_i) = -k\mathrm{Vol}(R) = |k|\mathrm{Vol}(R)$.


Theorem 12.62. Let $R = \prod_{i=1}^n [a_i, b_i]$ be an $n$-rectangle. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be the translation
$T(x) = x + c$, for some constant $c = (c_1, c_2, c_3, \ldots, c_n) \in \mathbb{R}^n$. Then $\mathrm{Vol}(T(R)) = \mathrm{Vol}(R)$
and $T(R)$ is a rectangle.


Proof. By definition, $T(R) = \prod_{i=1}^n [a_i + c_i, b_i + c_i]$ and
$\mathrm{Vol}(T(R)) = \prod_{i=1}^n (b_i + c_i - (a_i + c_i)) = \prod_{i=1}^n (b_i - a_i) = \mathrm{Vol}(R)$.




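Theorem 12.63. Let $R = \prod_{i=1}^n [a_i, b_i]$ be an $n$-rectangle. Let $t \in \mathbb{R}$ and let $T : \mathbb{R}^n \to \mathbb{R}^n$ be
the linear transformation defined by $T(e_j) = e_j + te_k$ for some $j, k \in \{1, 2, 3, \ldots, n\}$, and
$T(e_i) = e_i$ for all $i \in \{1, 2, 3, \ldots, n\} \setminus \{j\}$. Then $\mathrm{Vol}(T(R)) = \mathrm{Vol}(R)$.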


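Proof. If $t = 0$ then this is the identity map, and the result follows. Assume that $t > 0$. Let
$D = \prod_{\{1 \le i \le n \mid i \neq j\}} [a_i, b_i]$. Since $T$ does not change any coordinate except for the $j$th coordinate
on a point, all elements of $T(R)$ have coordinates other than their $j$th coordinate in $D$. Since
$T(R) = \{(x_1, x_2, \ldots, x_n) \mid (x_1, \ldots, x_{j-1}, x_{j+1}, \ldots, x_n) \in D \text{ and } a_j + tx_k \le x_j \le b_j + tx_k\}$,
$T(R)$ is a Jordan region by Theorem 12.53.

For $a_k \le c < d \le b_k$, we set $R_{[c,d]} = \{x \in R \mid c \le x_k \le d\}$. Define
$Q_{[c,d]} = \{x \in \mathbb{R}^n \mid a_i \le x_i \le b_i \text{ if } i \in \{1, 2, \ldots, n\} \setminus \{j, k\},\ c \le x_k \le d \text{ and } ct + a_j \le x_j \le dt + b_j\}$.
Then $Q_{[c,d]}$ is a rectangle which contains $T(R_{[c,d]})$ since $ct + a_j$ is the smallest value of $x_j + tx_k$
and $dt + b_j$ is the largest value of $x_j + tx_k$ if $x_k \in [c, d]$ and $x_j \in [a_j, b_j]$.

If $d - c$ is small enough that $dt - ct < b_j - a_j$ then we also define
$W_{[c,d]} = \{x \in \mathbb{R}^n \mid a_i \le x_i \le b_i \text{ if } i \in \{1, 2, \ldots, n\} \setminus \{j, k\},\ c \le x_k \le d \text{ and } dt + a_j \le x_j \le ct + b_j\}$.
In this case, $W_{[c,d]} \subset T(R_{[c,d]})$ because for all $x_k \in [c, d]$ it is true that $ct \le tx_k \le dt$. Hence, if
$dt + a_j \le x_j \le ct + b_j$ then $a_j + tx_k \le x_j \le b_j + tx_k$.

Choose $m \in \mathbb{N}$ so that $\Delta_m = \frac{b_k - a_k}{m} < \frac{b_j - a_j}{t}$, so $\Delta_m t < b_j - a_j$. Let
$P = \{a_k + i\Delta_m\}_{0 \le i \le m}$ be a partition of $[a_k, b_k]$. Then
$$\bigcup_{i=1}^m W_{[a_k + (i-1)\Delta_m,\, a_k + i\Delta_m]} \subset T(R) \subset \bigcup_{i=1}^m Q_{[a_k + (i-1)\Delta_m,\, a_k + i\Delta_m]}.$$
Since $W_{[a_k + (i-1)\Delta_m,\, a_k + i\Delta_m]} \cap W_{[a_k + i\Delta_m,\, a_k + (i+1)\Delta_m]}$ consists of
points whose $x_k$ coordinate is $a_k + i\Delta_m$, which is a subset of the boundaries of both
rectangles (and volume zero) for each $1 \le i \le m - 1$, it follows that the collection
$\{W_{[a_k + (i-1)\Delta_m,\, a_k + i\Delta_m]}\}_{1 \le i \le m}$ of rectangles is non-overlapping. Likewise,
$\{Q_{[a_k + (i-1)\Delta_m,\, a_k + i\Delta_m]}\}_{1 \le i \le m}$
is a collection of non-overlapping rectangles. Similarly, $\{R_{[a_k + (i-1)\Delta_m,\, a_k + i\Delta_m]}\}_{1 \le i \le m}$ is
a non-overlapping collection of rectangles which means that
$\{T(R_{[a_k + (i-1)\Delta_m,\, a_k + i\Delta_m]})\}_{1 \le i \le m}$
is a non-overlapping collection of Jordan regions by Theorem 12.58.

Hence,
$\sum_{i=1}^m |W_{[a_k + (i-1)\Delta_m,\, a_k + i\Delta_m]}| \le \mathrm{Vol}(T(R)) \le \sum_{i=1}^m |Q_{[a_k + (i-1)\Delta_m,\, a_k + i\Delta_m]}|$. The edge
factors of the rectangles $Q_{[a_k + (i-1)\Delta_m,\, a_k + i\Delta_m]}$ have lengths $b_i - a_i$ if
$i \in \{1, 2, 3, \ldots, n\} \setminus \{j, k\}$. The $k$th edge factor has length $\Delta_m$. The $j$th edge factor has length
$b_j + (a_k + i\Delta_m)t - ((a_k + (i-1)\Delta_m)t + a_j) = (b_j - a_j) + t\Delta_m$. Thus,
$$\sum_{i=1}^m |Q_{[a_k + (i-1)\Delta_m,\, a_k + i\Delta_m]}| = \Big( \prod_{i \in \{1, \ldots, n\} \setminus \{j,k\}} (b_i - a_i) \Big) \Big( \sum_{i=1}^m \Delta_m \Big) (b_j - a_j + t\Delta_m) = (b_j - a_j + t\Delta_m) \Big( \prod_{i \in \{1, \ldots, n\} \setminus \{j\}} (b_i - a_i) \Big)$$
$$= \mathrm{Vol}(R) + t\Delta_m \Big( \prod_{i \in \{1, \ldots, n\} \setminus \{j\}} (b_i - a_i) \Big).$$
Since we can make $\Delta_m$ as small as we wish by
making $m$ sufficiently large, it follows that $\mathrm{Vol}(T(R)) \le \mathrm{Vol}(R)$.

The edge factors of the rectangles $W_{[a_k + (i-1)\Delta_m,\, a_k + i\Delta_m]}$ have lengths $b_i - a_i$ when
$i \in \{1, 2, 3, \ldots, n\} \setminus \{j, k\}$. The $k$th edge factor has length $\Delta_m$. The $j$th edge factor
has length $(b_j - a_j) + (a_k + (i-1)\Delta_m)t - (a_k + i\Delta_m)t = (b_j - a_j) - \Delta_m t$. Thus,
$$\sum_{i=1}^m |W_{[a_k + (i-1)\Delta_m,\, a_k + i\Delta_m]}| = \Big( \prod_{i \in \{1, \ldots, n\} \setminus \{j,k\}} (b_i - a_i) \Big) \Big( \sum_{i=1}^m \Delta_m \Big) (b_j - a_j - t\Delta_m) = (b_j - a_j - t\Delta_m) \Big( \prod_{i \in \{1, \ldots, n\} \setminus \{j\}} (b_i - a_i) \Big)$$
$$= \mathrm{Vol}(R) - t\Delta_m \Big( \prod_{i \in \{1, \ldots, n\} \setminus \{j\}} (b_i - a_i) \Big).$$
Since we can make
$\Delta_m$ as small as we wish by making $m$ sufficiently large, it follows that $\mathrm{Vol}(T(R)) \ge \mathrm{Vol}(R)$
and so $\mathrm{Vol}(T(R)) = \mathrm{Vol}(R)$.

In the case where $t < 0$ the argument is similar.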
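
The squeeze in this proof can be seen numerically in two dimensions. The sketch below is an illustration under assumptions: it takes a sample rectangle $[a_j, b_j] \times [a_k, b_k]$ and a sample $t > 0$ (hypothetical values), treats the map as $x_j \mapsto x_j + t x_k$ as in the proof, and computes the total areas of the inner rectangles $W$ and outer rectangles $Q$ for a partition into $m$ strips. The function name is hypothetical.

```python
def shear_volume_bounds(a_j, b_j, a_k, b_k, t, m):
    """Total areas of the inner and outer rectangles for x_j -> x_j + t*x_k, with t > 0.

    Each strip of width delta in the x_k direction contributes an inner rectangle
    of length (b_j - a_j) - t*delta and an outer rectangle of length
    (b_j - a_j) + t*delta in the x_j direction.
    """
    delta = (b_k - a_k) / m
    if t * delta >= b_j - a_j:
        raise ValueError("m is too small: need t * delta < b_j - a_j")
    inner = m * delta * ((b_j - a_j) - t * delta)
    outer = m * delta * ((b_j - a_j) + t * delta)
    return inner, outer


volume = (3.0 - 1.0) * (5.0 - 2.0)  # Vol(R) for the sample rectangle [1, 3] x [2, 5]
bounds = [shear_volume_bounds(1.0, 3.0, 2.0, 5.0, t=0.5, m=m) for m in (10, 100, 1000)]
for inner, outer in bounds:
    print(inner, volume, outer)
```

As $m$ grows, both bounds approach $\mathrm{Vol}(R)$, which is what forces $\mathrm{Vol}(T(R)) = \mathrm{Vol}(R)$.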


Theorem 12.64. Let $\phi : U \to \mathbb{R}^n$, where $U$ is open in $\mathbb{R}^n$. Let $E \subset U$ for some Jordan
region $E$.

(a) If $\phi$ is the translation $\phi(x) = x + c$ for some $c \in \mathbb{R}^n$, or the linear transformation
defined by $\phi(e_j) = e_k$ and $\phi(e_k) = e_j$ and $\phi(e_i) = e_i$ for all other $i \in \{1, 2, 3, \ldots, n\}$, or
the linear transformation defined by $\phi(e_j) = e_j + te_k$ for some $j, k \in \{1, 2, 3, \ldots, n\}$, and
$\phi(e_i) = e_i$ for all $i \in \{1, 2, 3, \ldots, n\} \setminus \{j\}$, then $\mathrm{Vol}(\phi(E)) = \mathrm{Vol}(E)$.

(b) If $\phi$ is the linear transformation defined by $\phi(e_j) = ke_j$ for some non-zero constant
$k$, and $\phi(e_i) = e_i$ for $i \neq j$, then $\mathrm{Vol}(\phi(E)) = |k|\mathrm{Vol}(E)$.


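Proof. Note that $\phi(E)$ is a Jordan region by Theorem 12.58. Let $\varepsilon > 0$.

(a) We know $\phi$ and $\phi^{-1}$ take rectangles to Jordan regions of the same volume by
Theorems 12.60, 12.62, 12.63, and 12.58. Choose a finite collection of rectangles $\{R_i\}_{1 \le i \le k}$
which covers $E$ so that $\sum_{i=1}^k |R_i| < \mathrm{Vol}(E) + \varepsilon$. Then $\{\phi(R_i)\}_{1 \le i \le k}$ is a cover of $\phi(E)$ with
$\sum_{i=1}^k |\phi(R_i)| = \sum_{i=1}^k |R_i| < \mathrm{Vol}(E) + \varepsilon$. Since this is true for all $\varepsilon > 0$ we conclude that
$\mathrm{Vol}(\phi(E)) \le \mathrm{Vol}(E)$ by Theorem 12.24. Similarly, we can find a collection of rectangles
$\{Q_i\}_{1 \le i \le t}$ that covers $\phi(E)$ so that $\sum_{i=1}^t |Q_i| < \mathrm{Vol}(\phi(E)) + \varepsilon$. Hence, $\{\phi^{-1}(Q_i)\}_{1 \le i \le t}$ covers
$E$ and $\sum_{i=1}^t |\phi^{-1}(Q_i)| < \mathrm{Vol}(\phi(E)) + \varepsilon$. Since this is true for all $\varepsilon > 0$ we conclude that
$\mathrm{Vol}(E) \le \mathrm{Vol}(\phi(E))$. Hence, $\mathrm{Vol}(\phi(E)) = \mathrm{Vol}(E)$.

(b) By Theorem 12.61 we know that for any rectangle $R \subset U$, the image $\phi(R)$ is a
rectangle and $\mathrm{Vol}(\phi(R)) = |k||R|$. Note that $\phi^{-1}(e_j) = \frac{1}{k}e_j$ and $\phi^{-1}(e_i) = e_i$ for $i \neq j$.
Thus, for any rectangle $R$ it follows that $\phi^{-1}(R)$ is a rectangle and $\mathrm{Vol}(\phi^{-1}(R)) = \frac{1}{|k|}|R|$.

Let $\{R_i\}_{1 \le i \le k}$ be a collection of rectangles which covers $E$ so that $\sum_{i=1}^k |R_i| < \mathrm{Vol}(E) + \varepsilon$.
Then $\{\phi(R_i)\}_{1 \le i \le k}$ is a cover of $\phi(E)$ with
$\sum_{i=1}^k |\phi(R_i)| = \sum_{i=1}^k |k||R_i| < |k|\mathrm{Vol}(E) + |k|\varepsilon$. Since
this is true for all $\varepsilon > 0$ we conclude that $\mathrm{Vol}(\phi(E)) \le |k|\mathrm{Vol}(E)$.

Similarly, we can find a collection of rectangles $\{Q_i\}_{1 \le i \le t}$ that covers $\phi(E)$ so that
$\sum_{i=1}^t |Q_i| < \mathrm{Vol}(\phi(E)) + \varepsilon$. Hence, $\{\phi^{-1}(Q_i)\}_{1 \le i \le t}$ is a collection of rectangles that covers $E$
so that $\sum_{i=1}^t |\phi^{-1}(Q_i)| = \frac{1}{|k|} \sum_{i=1}^t |Q_i| < \frac{1}{|k|}\mathrm{Vol}(\phi(E)) + \frac{\varepsilon}{|k|}$. Since this is true for all $\varepsilon > 0$
we conclude that $\mathrm{Vol}(E) \le \frac{1}{|k|}\mathrm{Vol}(\phi(E))$ so $|k|\mathrm{Vol}(E) \le \mathrm{Vol}(\phi(E))$. Hence, $\mathrm{Vol}(\phi(E)) =
|k|\mathrm{Vol}(E)$.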


Theorem 12.65. Let $E$ be a Jordan region in $\mathbb{R}^n$, and let $T(x) = Ax$ for each $x \in \mathbb{R}^n$,
where $\det(A) \neq 0$. Then $\mathrm{Vol}(T(E)) = |\det(A)|\mathrm{Vol}(E)$.


Proof. Since $\det(A) \neq 0$ we can write $A = E_1 E_2 \cdots E_k$, where $E_1, E_2, \ldots, E_k$ are elementary
matrices by Theorem 14.13 and for all $1 \le i \le k$, $E_i$ is a matrix with corresponding
linear transformation $T_i(x) = E_i x$ of one of the following forms: a matrix obtained by
switching the $i$th row and $j$th row of the identity matrix (multiplication by which gives a
linear transformation that interchanges the $i$th and $j$th standard basis vectors and takes
all other standard basis vectors to themselves), a matrix obtained by adding the $i$th row
times a number $k$ to the $j$th row (multiplication by which is the linear transformation
$T(e_i) = e_i + ke_j$), or a matrix multiplying the $j$th row of the identity matrix by the non-zero
number $k$, multiplication by which gives the transformation defined by $T(e_j) = ke_j$
and $T(e_i) = e_i$ for $i \neq j$.

Applying Theorem 12.64 $k$ times, we see
$\mathrm{Vol}(T(E)) = |\det E_1||\det E_2||\det E_3| \cdots |\det E_k|\mathrm{Vol}(E)$. This is
equal to $|\det(A)|\mathrm{Vol}(E)$ since the product of the determinants of matrices is the determinant
of the product of those matrices by Theorem 14.13.
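
The bookkeeping in this proof can be illustrated with a small exact computation. The sketch below, using hypothetical sample matrices that are not from the text, builds a $2 \times 2$ matrix as a product of one elementary matrix of each type and compares the product of the individual volume factors $|\det E_i|$ with $|\det A|$.

```python
from fractions import Fraction as Fr


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


swap = [[Fr(0), Fr(1)], [Fr(1), Fr(0)]]    # interchanges e_1 and e_2, determinant -1
shear = [[Fr(1), Fr(0)], [Fr(3), Fr(1)]]   # adds 3 times row 1 to row 2, determinant 1
scale = [[Fr(1), Fr(0)], [Fr(0), Fr(-2)]]  # multiplies row 2 by -2, determinant -2

A = matmul(matmul(swap, shear), scale)
volume_factor = abs(det2(swap)) * abs(det2(shear)) * abs(det2(scale))
print(A, volume_factor, abs(det2(A)))
```

Each elementary factor scales volumes by 1 or by $|k|$, and the product of these factors matches $|\det(A)|$, as the theorem asserts.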


The following is a development with aspects paralleling that found in Buck’s Advanced 
Calculus text with some parts paralleling part of the approach found in Wade’s Advanced 
Calculus, though it is also significantly different in many respects. Despite the differences 
in the approach, a student would probably benefit from looking at the proofs for Change 
of Variables given in those two texts in order to improve context and perspective to the 
development below. 


Definition 97 


Let $U$ be an open set in $\mathbb{R}^n$ and let $F : J \to \mathbb{R}$, where $J$ is the set of all Jordan
regions whose closures are contained in $U$. Then we say that $\lim_{R \to p} F(R) = L$ if for
every $\varepsilon > 0$ there is a $\delta > 0$ so that if $R$ is a cube in $U$ containing $p$ and $\mathrm{diam}(R) < \delta$
then $|F(R) - L| < \varepsilon$. We say that $F$ is additive if $F(E_1 \cup E_2) = F(E_1) + F(E_2)$
whenever $E_1$ and $E_2$ are disjoint. We say that $F$ is monotone if for any pair of Jordan
regions $E_1, E_2$ so that $E_1 \subset E_2$ it is true that $F(E_1) \le F(E_2)$.

If $f(p) = \lim_{R \to p} F(R)$ exists for all $p \in D$ for some set $D \subset \mathbb{R}^n$ then we say that
$\lim_{R \to p} F(R) = f(p)$ uniformly on $D$ if for every $\varepsilon > 0$ there is some $\delta > 0$ so that if $R$
is a cube containing $p$ and $\mathrm{diam}(R) < \delta$ then $|F(R) - f(p)| < \varepsilon$.

We say that $F$ is differentiable on $D$ if $F'(p) = \lim_{R \to p} \frac{F(R)}{\mathrm{Vol}(R)}$ exists at each $p \in D$.
We say that $F$ is uniformly differentiable if this limit exists uniformly on $D$.
We say that $F$ is volume continuous if $F(E) = 0$ for every Jordan region $E \subset U$
so that $\mathrm{Vol}(E) = 0$.


We may refer to functions like $F$ described above as set functions to indicate to the
reader that their domains are sets of points rather than individual points in $\mathbb{R}^n$.


Theorem 12.66. Let $f : U \to \mathbb{R}$ be continuous, where $U$ is an open subset of $\mathbb{R}^n$, and let $J$
be the set of all Jordan regions contained in $U$. Let $F : J \to \mathbb{R}$ be defined by $F(E) = \int_E f$.
Then:
(a) $F$ is additive and volume continuous with $F'(p) = f(p)$ for each $p \in U$.
(b) $F$ is uniformly differentiable on any compact set $K \subset U$.
(c) If $f$ is non-negative then $F$ is monotone.


Proof. (a) We know $F$ is additive by Theorem 12.44. Since $\int_E f = 0$ if $\mathrm{Vol}(E) = 0$, $F$ is
also volume continuous.

(b) Let $p \in U$. Given any $\varepsilon > 0$ we can find $\delta > 0$ so that if $|p - x| < \delta$ then
$|f(x) - f(p)| < \varepsilon$, so if $R$ is a cube containing $p$ and $\mathrm{diam}(R) < \delta$ then
$f(p) - \varepsilon < f(x) < f(p) + \varepsilon$ for all $x \in R$. Thus,
$(f(p) - \varepsilon)\mathrm{Vol}(R) \le \int_R f \le (f(p) + \varepsilon)\mathrm{Vol}(R)$ by Theorem
12.44. Hence, $f(p) - \varepsilon \le \frac{F(R)}{\mathrm{Vol}(R)} \le f(p) + \varepsilon$, which means that $F'(p) = f(p)$.

On a compact set $K \subset U$ we can (by the Lebesgue Number Lemma) pick $\delta_1 > 0$
so that any set of diameter less than $\delta_1$ which intersects $K$ is a subset of $U$. Since $K$ is
compact, $f$ is uniformly continuous, which means that we can make the choice of $0 < \delta < \delta_1$
as above so that if $|p - x| < \delta$ then $|f(x) - f(p)| < \varepsilon$ for all $p \in K$, and therefore
$f(p) - \varepsilon \le \frac{F(R)}{\mathrm{Vol}(R)} \le f(p) + \varepsilon$ for all $p \in K$ and cubes $R$ containing $p$ so that
$\mathrm{diam}(R) < \delta$, which means that $F$ is uniformly differentiable on $K$.

(c) If $f$ is non-negative then if $E_1, E_2$ are Jordan regions with $E_1 \subset E_2 \subset U$ then it
follows that $\int_{E_2 \setminus E_1} f \ge 0$, so $\int_{E_1} f \le \int_{E_1} f + \int_{E_2 \setminus E_1} f = \int_{E_2} f$.
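
The statement $F'(p) = f(p)$ can be observed numerically. The sketch below is an illustration only, with a hypothetical sample density $f(x, y) = x^2 + y$ on $\mathbb{R}^2$ (not from the text): it computes $F(R) = \int_R f$ exactly for squares $R$ centered at a point $p$ and prints how far the quotient $\frac{F(R)}{\mathrm{Vol}(R)}$ is from $f(p)$ as the squares shrink.

```python
def f(x, y):
    """Sample continuous density (an illustrative choice)."""
    return x * x + y


def F(x0, x1, y0, y1):
    """Exact integral of f over the rectangle [x0, x1] x [y0, y1]."""
    return (x1 ** 3 - x0 ** 3) / 3.0 * (y1 - y0) + (x1 - x0) * (y1 ** 2 - y0 ** 2) / 2.0


def difference_quotient(p, side):
    """F(R)/Vol(R) for the square R of the given side length centered at p."""
    px, py = p
    h = side / 2.0
    return F(px - h, px + h, py - h, py + h) / (side * side)


p = (1.0, 2.0)
quotients = [difference_quotient(p, 10.0 ** (-k)) for k in range(1, 5)]
errors = [abs(q - f(*p)) for q in quotients]
print(errors)
```

For this sample density the error works out to one twelfth of the square of the side length, so it shrinks rapidly with the diameter of the cube.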


The following is a little like an analogue for the fundamental theorem of calculus for 
functions on sets as defined above. 


Theorem 12.67. Let $F : J \to \mathbb{R}$ be additive and volume continuous, where $J$ is the set of
all Jordan regions in an open set $U \subset \mathbb{R}^n$, so that $F$ is differentiable on $U$ and uniformly
differentiable on compact convex subsets of $U$, with $F'(p) = f(p)$ for each $p \in U$. Then $f$
is continuous on $U$ and $F(Q) = \int_Q f$ for every cube $Q \subset U$. If $F$ is non-negative on some
open set $V$ so that $\overline{V} \subset U$ then $F(E) = \int_E f$ for every Jordan region $E \subset V$.
Proof. We first show that f is continuous. Let « > 0. Let p € U. Then p € C,(p) Cc U 


for some r > 0. Since C;.(p) is compact and convex, F' is uniformly differentiable on C;.(p). 
Hence, we can choose 56; > 0 so that if R is a cube with diam(R) < 6; and x € RNC,(p) 


F 
then IZ oH — f(x)| < 5 We can choose a cube W centered at p with diameter less 
) 
than 6; containing a ball B,(p) for some y > 0. If |p—x| < 7 then x,p € W, so 
F(W F(W 
lf(p) — f(x)| < |f(p) ene aa f(x)| < «. Hence, f is continuous at each 
point p € U. 
We define a function G: J > R by G(E) = f. By Theorem 12.66, we know 


E 
that G is differentiable on U and uniformly differentiable on compact subsets of U, with 
F'(p) = G'(p) = f(p) for all p € U. 
Let Q be a cube contained in U. Then Q is compact and convex so we can find a 6 > 4 


F 
so that if R is a cube containing a point p of Q and diam(R) < 6 then | f(p) — V mpl <5 <i 
G(R) € F(R) G(R) 
F(R)—- G(R Vol(R). 
and [f() — Garayl < $9 8° laacpy — parc! <6 82 IFLR) ~ GR) < eVoI(R) 
k 


Let G = {Ri}i<i<pz be a grid on Q consisting of cubes, with |G] < 6. Let S = U O(R:). 
i=1 
Then Vol(S) = 0 ae since F' is eave and ees continuous we know that F(Q) = 


F((J Ri) = F(S) + So F(R?) = So F(R?) = 57 F(R) By definition, G(Q =f f= 
i=l i=1 

k k 

S" G(Ri). Thus, F(Q) — G(Q) < € = Vol(Ri) = eVol(Q). Since this is true for all € > 0 

i=l i=1 

it follows that F(Q) = G(Q). 

Next, assume that $F$ is non-negative on $V$ and $E \subseteq V \subseteq \overline{V} \subseteq U$. Suppose that $f(p) < 0$ for some $p \in V$ and choose $t > 0$ so that if $|x - p| < t$ then $|f(x) - f(p)| < \frac{|f(p)|}{2}$. Thus, if $Q$ is a cube containing $p$ which is contained in $B_t(p)$ then $F(Q) = \int_Q f \le \int_Q \frac{f(p)}{2} = \frac{f(p)}{2}\mathrm{Vol}(Q) < 0$, which is impossible. We conclude that $f$ is also non-negative. Hence, by Theorem 12.66, we know that $F$ is monotone.

Let $E$ be a Jordan region. By Theorem 12.49 we can find an $\eta > 0$ so that any grid on a cube containing $E$ whose mesh is less than $\eta$ has the property that all upper and lower sums, inner sums and outer sums are within a distance $\epsilon$ of the integral of $f$. We can also choose $\eta$ so that any set of diameter less than $\eta$ intersecting $E$ is a subset of $U$. Let $H = \{Q_i\}_{1 \le i \le m}$ be a grid consisting entirely of cubes on a cube containing $E$ with $|H| < \eta$. Since $f$ is non-negative, $0 \le m_i \le M_i$ for each $1 \le i \le m$.

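Since $F$ is monotone, it follows that $F\left(\bigcup I(E,H)\right) \le F(E) \le F\left(\bigcup O(E,H)\right)$ since $\bigcup I(E,H) \subseteq E \subseteq \bigcup O(E,H)$. Hence,

$$\int_E f - \epsilon < L(f,H)^{\circ} = \sum_{Q_i \in I(E,H)} m_i \mathrm{Vol}(Q_i) \le \sum_{Q_i \in I(E,H)} \int_{Q_i} f = \sum_{Q_i \in I(E,H)} F(Q_i) = F\left(\bigcup I(E,H)\right) \le F(E)$$

$$\le F\left(\bigcup O(E,H)\right) = \sum_{Q_i \in O(E,H)} F(Q_i) = \sum_{Q_i \in O(E,H)} \int_{Q_i} f \le \sum_{Q_i \in O(E,H)} M_i \mathrm{Vol}(Q_i) = U(f,H) < \int_E f + \epsilon.$$

Thus, we see that $\left|F(E) - \int_E f\right| < \epsilon$ for all $\epsilon > 0$, so $\int_E f = F(E)$.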


Theorem 12.68. Let $\phi : U \to \mathbb{R}^n$ be $C^1$, one to one and have $\Delta_\phi \ne 0$ on $U$. Let $C_{2r}(z) \subseteq U$ and let $0 < s < \frac{1}{2}$. Let $|\phi(x) - x| < sr$ for all $x \in C_{2r}(z)$. Then $C_{2r(1-s)}(z) \subseteq \phi(C_{2r}(z))$.


Proof. Let $q \in C_{2r(1-s)}(z)$. Set $\psi(x) = |\phi(x) - q|$ on $C_{2r}(z)$. Since $\psi$ is continuous it takes on a minimum value $m$ on $C_{2r}(z)$. Note that $|\phi(q) - q| < sr$ so $m < sr$. Let $y$ be any point of $\partial(C_{2r}(z))$. Then $y_j = z_j \pm r$ for some $j \in \{1, 2, 3, \ldots, n\}$. Since $|\phi(y) - y| < sr$ we know $|\phi(y)_j - y_j| < sr$ so either $\phi(y)_j > z_j + r - sr$ or $\phi(y)_j < z_j - r + sr$, which means that $|\phi(y)_j - z_j| > (1-s)r$. Since $(1-s)r > sr$ (because $s < \frac{1}{2}$) it follows that $\psi(y) > m$ and therefore $\psi$ cannot take on a minimum on $\partial(C_{2r}(z))$. This means that there is some $p \in C_{2r}(z)^{\circ}$ so that $\psi$ takes on a minimum at $p$.


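Let $u_1, u_2, \ldots, u_n$ be the variables for $\phi$ and $x_1, x_2, \ldots, x_n$ be the variables for $g(x) = \sum_{i=1}^{n} (x_i - q_i)^2$. Because $\psi^2 = g(\phi(x))$ takes on a local minimum at $p$, it follows that all partial derivatives of $\psi^2$ with respect to variables $u_1, u_2, \ldots, u_n$ are zero at $p$. Thus,

$$2(x_1 - q_1)\frac{\partial x_1}{\partial u_1}(p) + 2(x_2 - q_2)\frac{\partial x_2}{\partial u_1}(p) + \cdots + 2(x_n - q_n)\frac{\partial x_n}{\partial u_1}(p) = 0,$$

$$2(x_1 - q_1)\frac{\partial x_1}{\partial u_2}(p) + 2(x_2 - q_2)\frac{\partial x_2}{\partial u_2}(p) + \cdots + 2(x_n - q_n)\frac{\partial x_n}{\partial u_2}(p) = 0,$$

$$\vdots$$

$$2(x_1 - q_1)\frac{\partial x_1}{\partial u_n}(p) + 2(x_2 - q_2)\frac{\partial x_2}{\partial u_n}(p) + \cdots + 2(x_n - q_n)\frac{\partial x_n}{\partial u_n}(p) = 0.$$

Since $\Delta_\phi(p) \ne 0$ this system has a unique solution, namely $x_i = q_i$ for each $i$, which means that $\phi(p) = (x_1, x_2, \ldots, x_n) = q$ and therefore $C_{2r(1-s)}(z) \subseteq \phi(C_{2r}(z))$.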


For the next theorem, we will want to distinguish between functions from smaller dimensions fixing certain coordinates and functions from spaces of larger dimensions. Similar to notation used previously, for $x, y \in \mathbb{R}^n$ we write $(x, y)$ for the vector in $\mathbb{R}^{2n}$ whose first $n$ coordinates are the coordinates of $x$ and whose last $n$ coordinates are those of $y$.


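Theorem 12.69. (a) Let $\delta, \epsilon > 0$. If $x, y, w, z \in \mathbb{R}^n$ and $|x - w| < \epsilon$ and $|y - z| < \delta$ then $|(x, y) - (w, z)| < \sqrt{\epsilon^2 + \delta^2}$.
(b) Let $U, V$ be open sets in $\mathbb{R}^n$. Then $U \times V$ is open in $\mathbb{R}^{2n}$.
(c) Let $A, B$ be closed in $\mathbb{R}^n$. Then $A \times B$ is closed in $\mathbb{R}^{2n}$.
(d) Let $K, M$ be compact subsets of $\mathbb{R}^n$. Then $K \times M$ is a compact subset of $\mathbb{R}^{2n}$.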


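Proof. (a) Since $\sum_{i=1}^{n} (x_i - w_i)^2 < \epsilon^2$ and $\sum_{i=1}^{n} (y_i - z_i)^2 < \delta^2$, we have $\sum_{i=1}^{n} (x_i - w_i)^2 + \sum_{i=1}^{n} (y_i - z_i)^2 < \epsilon^2 + \delta^2$, so $|(x,y) - (w,z)| < \sqrt{\epsilon^2 + \delta^2}$.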


(b) If $U$ or $V$ is empty then the cross product $U \times V$ is empty and therefore open. Otherwise, let $(x,y) \in U \times V$. Then $x \in U$ and $y \in V$. Hence, we can find $\epsilon_x, \epsilon_y > 0$ so that $B_{\epsilon_x}(x) \subseteq U$ and $B_{\epsilon_y}(y) \subseteq V$. Let $m = \min\{\epsilon_x, \epsilon_y\}$. Let $(w,z) \in B_m(x,y)$. Since $|(x,y) - (w,z)|$ is greater than or equal to both $|x - w|$ and $|y - z|$, we know that $|x - w| < m$ so $w \in B_m(x)$, and similarly, $|y - z| < m$ so $z \in B_m(y)$. Therefore $B_m(x,y) \subseteq B_m(x) \times B_m(y) \subseteq U \times V$, so $U \times V$ is open.

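(c) If $A$ or $B$ is empty then their product is empty and therefore closed. Otherwise, note that since $A, B$ are closed, $\mathbb{R}^n \setminus A$ is open and $\mathbb{R}^n \setminus B$ is open. By part (b) we know that $(\mathbb{R}^n \setminus A) \times \mathbb{R}^n$ and $\mathbb{R}^n \times (\mathbb{R}^n \setminus B)$ are open in $\mathbb{R}^{2n}$, which means that $W = ((\mathbb{R}^n \setminus A) \times \mathbb{R}^n) \cup (\mathbb{R}^n \times (\mathbb{R}^n \setminus B))$ is also open. Note $(x,y) \notin A \times B$ if and only if $x \notin A$ or $y \notin B$. If $x \in \mathbb{R}^n \setminus A$ then $(x,y) \in (\mathbb{R}^n \setminus A) \times \mathbb{R}^n$ and if $y \notin B$ then $(x,y) \in \mathbb{R}^n \times (\mathbb{R}^n \setminus B)$. Thus, $\mathbb{R}^{2n} \setminus (A \times B) = W$ is open, and therefore $A \times B$ is closed.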

(d) By the Heine-Borel Theorem, $K$ and $M$ are compact if and only if they are closed and bounded. Since $K$ and $M$ are bounded we can find numbers $r, s > 0$ so that $K \subseteq B_r(0)$ and $M \subseteq B_s(0)$. Then by part (a) we know $K \times M \subseteq B_{\sqrt{r^2 + s^2}}(0)$ in $\mathbb{R}^{2n}$, which means that $K \times M$ is bounded.

By part (c), we know that $K \times M$ is closed. Hence, by the Heine-Borel Theorem we can conclude that $K \times M$ is compact.


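Theorem 12.70. Let $p \in U$, an open set in $\mathbb{R}^n$. Let $\phi : U \to \mathbb{R}^n$ be one to one and $C^1$ with $\Delta_\phi \ne 0$ on $U$. Then $\lim_{R \downarrow p} \frac{\mathrm{Vol}(\phi(R))}{\mathrm{Vol}(R)} = |\Delta_\phi(p)|$. Furthermore, if $K$ is a convex compact subset of $U$ then $\lim_{R \downarrow p} \frac{\mathrm{Vol}(\phi(R))}{\mathrm{Vol}(R)} = |\Delta_\phi(p)|$ uniformly on $K$.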


1 
Proof. Let $K$ be a compact convex subset of $U$ and let $0 < s < \frac{1}{2}$. Let $\epsilon = \frac{s}{\sqrt{n}}$. For each $p \in U$ we define $\psi_p(x) = (D\phi(p))^{-1}\phi(x)$. Then $\psi_p$ is $C^1$ and $D\psi_p(p) = I$. Define $g_p(x) = \psi_p(x) - \psi_p(p) + p$, and define $f_p(x) = g_p(x) - x$. Let $h_p : U \to \mathbb{R}$ be


defined by hp(x) = /n ss yi | in Be 
i=1 1 

For each of these functions $\psi_p, g_p, f_p : U \to \mathbb{R}^n$ and $h_p : U \to \mathbb{R}$ we define corresponding functions $\psi(p,x), g(p,x), f(p,x) : U \times U \to \mathbb{R}^n$ and $h(p,x) : U \times U \to \mathbb{R}$ by $\psi(p,x) = \psi_p(x)$, $g(p,x) = g_p(x)$, $f(p,x) = f_p(x)$ and $h(p,x) = h_p(x)$. Note that $Df_p(p) = 0_{n \times n}$ and $f_p(p) = 0$ for each $p \in U$. These functions are all continuous since the partial derivatives of $\psi$ are continuous, which means that, in particular, $h(p,x)$ is uniformly continuous on the set $K \times K$, which is compact by Theorem 12.69. Thus, we can find a number $\delta_1 > 0$ so that if $|(y,z) - (p,x)| < \delta_1$ then $|h(y,z) - h(p,x)| < \epsilon$. More specifically, for any $p \in K$ it is true that if $|x - p| < \delta_1$ then $|h_p(p) - h_p(x)| < \epsilon$. Since $h_p(p) = 0$ this means that $h_p(x) < \epsilon$ for all $x \in K$ so that $|x - p| < \delta_1$.


Fix some $p \in K$. Let $C_r(q)$ be a cube of side length less than $\delta = \frac{\delta_1}{\sqrt{n}}$ which contains $p$. Then $C_r(q) \subseteq B_{\delta_1}(p)$, which means that $h_p(x) < \epsilon$ on $C_r(q)$. If $x, y \in C_r(q)$ then $L(x,y) \subseteq C_r(q)$ since cubes are convex, which means that $|f_p(p) - f_p(x)| = |f_p(x)| \le \left(\sup_{w \in C_r(q)} h_p(w)\right)|p - x|$ by Theorem 11.18. This is less than $\epsilon|p - x| \le \epsilon\,\mathrm{diam}(C_r(q)) = \epsilon\sqrt{n}\,r = \frac{s}{\sqrt{n}}\sqrt{n}\,r = sr$.


This means that $|g_p(x) - x| < sr$ for all $x \in C_r(q)$. Hence, by Theorem 12.68, we know that $g_p(C_r(q))$ contains a cube $C_{r(1-s)}(q)$. We also know that $\psi_p(C_r(q))$ is a Jordan region by Theorem 12.58. Hence, $\mathrm{Vol}(C_{r(1-s)}(q)) \le \mathrm{Vol}(g_p(C_r(q)))$. Translations preserve volumes of sets by Theorems 12.62 and 12.64, so $\mathrm{Vol}(g_p(C_r(q))) = \mathrm{Vol}(\psi_p(C_r(q)))$, so we also know that $\mathrm{Vol}(C_{r(1-s)}(q)) \le \mathrm{Vol}(\psi_p(C_r(q)))$.


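If $y \in C_r(q)$ then $q_j - \frac{r}{2} \le y_j \le q_j + \frac{r}{2}$ for each $j \in \{1, 2, 3, \ldots, n\}$. Since $|g_p(y) - y| < sr$ we know $|g_p(y)_j - y_j| < sr$ so $q_j - \frac{r}{2} - sr \le y_j - sr < g_p(y)_j < y_j + sr \le q_j + \frac{r}{2} + sr$. Hence, it follows that $g_p(C_r(q)) \subseteq C_{r(1+2s)}(q)$. Thus, $\mathrm{Vol}(g_p(C_r(q))) \le \mathrm{Vol}(C_{r(1+2s)}(q))$, so $\mathrm{Vol}(\psi_p(C_r(q))) \le \mathrm{Vol}(C_{r(1+2s)}(q))$. Hence, if $C_r(q)$ is a cube whose diameter is less than $\delta$ which contains $p$ then

$$\frac{\mathrm{Vol}(C_{r(1-s)}(q))}{\mathrm{Vol}(C_r(q))} \le \frac{\mathrm{Vol}(\psi_p(C_r(q)))}{\mathrm{Vol}(C_r(q))} \le \frac{\mathrm{Vol}(C_{r(1+2s)}(q))}{\mathrm{Vol}(C_r(q))}.$$

Note that the left and right sides can be made as close as we wish to one by making $s$ small enough since $\mathrm{Vol}(C_{r(1-s)}(q)) = (r(1-s))^n$ and $\mathrm{Vol}(C_{r(1+2s)}(q)) = (r(1+2s))^n$, which are both continuous functions of $s$, so $\lim_{R \downarrow p} \frac{\mathrm{Vol}(\psi_p(R))}{\mathrm{Vol}(R)} = 1$. Since the choice of $\delta$ was independent of the choice of $p \in K$ it is also true that $\lim_{R \downarrow p} \frac{\mathrm{Vol}(\psi_p(R))}{\mathrm{Vol}(R)} = 1$ uniformly on $K$.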
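Since $|\Delta_\phi(x)|$ is continuous on the compact set $K$, by the Extreme Value Theorem we can find an $M > 0$ so that $M > \max_{x \in K} |\Delta_\phi(x)|$. Let $\epsilon_1 > 0$. Since $\lim_{R \downarrow p} \frac{\mathrm{Vol}(\psi_p(R))}{\mathrm{Vol}(R)} = 1$ uniformly on $K$, we can choose $\delta > 0$ so that for any $p \in K$, if $R$ is a cube of diameter less than $\delta$ and $p \in R$ then $1 - \frac{\epsilon_1}{M} < \frac{\mathrm{Vol}(\psi_p(R))}{\mathrm{Vol}(R)} < 1 + \frac{\epsilon_1}{M}$. Since $\psi_p(x) = (D\phi(p))^{-1}\phi(x)$, by Theorem 12.65 $\mathrm{Vol}(\psi_p(R)) = \mathrm{Vol}((D\phi(p))^{-1}\phi(R)) = |\det (D\phi(p))^{-1}|\,\mathrm{Vol}(\phi(R))$. We also know that $\det (D\phi(p))^{-1} = \frac{1}{\Delta_\phi(p)}$. Thus,

$$1 - \frac{\epsilon_1}{|\Delta_\phi(p)|} < 1 - \frac{\epsilon_1}{M} < \frac{1}{|\Delta_\phi(p)|}\frac{\mathrm{Vol}(\phi(R))}{\mathrm{Vol}(R)} < 1 + \frac{\epsilon_1}{M} < 1 + \frac{\epsilon_1}{|\Delta_\phi(p)|},$$

which means that $|\Delta_\phi(p)| - \epsilon_1 < \frac{\mathrm{Vol}(\phi(R))}{\mathrm{Vol}(R)} < |\Delta_\phi(p)| + \epsilon_1$, and therefore $\lim_{R \downarrow p} \frac{\mathrm{Vol}(\phi(R))}{\mathrm{Vol}(R)} = |\Delta_\phi(p)|$ uniformly (since the choice of $\delta$ depended only on $M$ and $K$ and not on $p$) on every compact convex subset of $U$.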
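
The limit in Theorem 12.70 can be explored numerically. The following Python sketch is only an illustration under stated assumptions: the map `phi(u, v) = (u + v^2, v + u^2/2)` is a hypothetical example chosen here (it is not from the text), the cubes are squares $[u_0, u_0 + h] \times [v_0, v_0 + h]$ with a corner at $p$, and the area of the image is approximated by mapping sample points of the boundary and applying the shoelace formula.

```python
def phi(u, v):
    # Hypothetical C^1 map chosen only for illustration (an assumption).
    return (u + v * v, v + 0.5 * u * u)


def jacobian_det(u, v):
    # D(phi) = [[1, 2v], [u, 1]], so det D(phi) = 1 - 2uv.
    return 1.0 - 2.0 * u * v


def image_area(u0, v0, h, m=400):
    # Sample the boundary of R = [u0, u0+h] x [v0, v0+h] counterclockwise,
    # map the samples through phi, and apply the shoelace formula to the
    # resulting polygon, which approximates phi(R).
    boundary = []
    for k in range(m):
        t = h * k / m
        boundary.append((u0 + t, v0))          # bottom edge
    for k in range(m):
        t = h * k / m
        boundary.append((u0 + h, v0 + t))      # right edge
    for k in range(m):
        t = h * k / m
        boundary.append((u0 + h - t, v0 + h))  # top edge
    for k in range(m):
        t = h * k / m
        boundary.append((u0, v0 + h - t))      # left edge
    image = [phi(u, v) for (u, v) in boundary]
    twice_area = 0.0
    for idx in range(len(image)):
        x1, y1 = image[idx]
        x2, y2 = image[(idx + 1) % len(image)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def volume_ratio(u0, v0, h):
    # Vol(phi(R)) / Vol(R) for the square R of side h with corner (u0, v0).
    return image_area(u0, v0, h) / (h * h)


if __name__ == "__main__":
    p = (0.5, 0.3)
    for h in (0.1, 0.01, 0.001):
        print(h, volume_ratio(p[0], p[1], h), jacobian_det(p[0], p[1]))
```

As the side length shrinks, the printed ratios should approach the absolute value of the Jacobian determinant at $p$, which is what the theorem predicts for this sample map.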


Theorem 12.71. Let $U$ be an open set in $\mathbb{R}^n$ and let $\phi : U \to \mathbb{R}^n$ be one to one and $C^1$ with $\Delta_\phi \ne 0$ on $U$. Then there is a volume continuous, additive, monotone set function $F : \mathcal{J} \to [0, \infty)$, where $\mathcal{J}$ is the set of all Jordan regions whose closures are contained in $U$, so that $F(C) = \mathrm{Vol}(\phi(C))$ for every cube $C \subseteq U$.

Such a function $F$ must satisfy $F(E) = \mathrm{Vol}(\phi(E))$ for every Jordan region $E$ whose closure is contained in $U$.

Proof. First, define $F(E) = \int_{\phi(E)} 1 = \mathrm{Vol}(\phi(E))$ for every Jordan region $E$ whose closure is contained in $U$. Then $F(C) = \mathrm{Vol}(\phi(C))$ for every cube $C \subseteq U$, and $F$ is non-negative and monotone. Since $\mathrm{Vol}(\phi(E)) = 0$ if $E$ is a Jordan region of volume zero whose closure is contained in $U$ by Theorem 12.59, $F$ is volume continuous. Hence, such an $F$ exists.

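Next, assume that $F : \mathcal{J} \to [0, \infty)$, where $\mathcal{J}$ is the set of all Jordan regions whose closures are contained in $U$, is a set function with these properties so that $F(C) = \mathrm{Vol}(\phi(C))$ for every cube $C \subseteq U$. Let $E$ be a Jordan region whose closure is contained in $U$ and let $\epsilon > 0$. By the Lebesgue Number Lemma we can find a $\delta_1 > 0$ so that if $S$ is a set of diameter less than or equal to $\delta_1$ which intersects $E$ then $S \subseteq U$. By Theorem 12.56, we can find an $m > 0$ and a $\delta_2 > 0$ so that if a cube $C$ has diameter less than $\delta_2$ and intersects $E$ then $\phi(C)$ is contained in a cube of volume less than $m\,\mathrm{Vol}(C)$.

By Theorem 12.46 we can find $\delta_3 > 0$ so that if $H$ is a grid on a rectangle containing $E$ with $|H| < \delta_3$ then $V(\partial(E), H) < \frac{\epsilon}{m}$.

Choose a grid $G = \{R_i\}_{1 \le i \le k}$ on a cube containing $E$ so that $|G| < \min\{\delta_1, \delta_2, \delta_3\}$. Then $\bigcup_{R_i \in I(E,G)} \phi(R_i) \subseteq \phi(E)$, so $\sum_{R_i \in I(E,G)} \mathrm{Vol}(\phi(R_i)) = F\left(\bigcup I(E,G)\right) \le F(E)$ since $F$ is monotone. Since $\bigcup_{R_i \in O(E,G)} \phi(R_i) \supseteq \phi(E)$, we know $\sum_{R_i \in O(E,G)} \mathrm{Vol}(\phi(R_i)) = F\left(\bigcup O(E,G)\right) \ge F(E)$. Also, we know that

$$\sum_{R_i \in O(E,G)} \mathrm{Vol}(\phi(R_i)) - \sum_{R_i \in I(E,G)} \mathrm{Vol}(\phi(R_i)) = \sum_{R_i \in S(\partial(E),G)} \mathrm{Vol}(\phi(R_i)) < m \sum_{R_i \in S(\partial(E),G)} \mathrm{Vol}(R_i) < m\,\frac{\epsilon}{m} = \epsilon.$$

Thus, $\mathrm{Vol}(\phi(E)), F(E) \in \left[\sum_{R_i \in I(E,G)} \mathrm{Vol}(\phi(R_i)), \sum_{R_i \in O(E,G)} \mathrm{Vol}(\phi(R_i))\right]$, and hence $|\mathrm{Vol}(\phi(E)) - F(E)| < \epsilon$. Since this is true for all $\epsilon > 0$ it follows that $F(E) = \mathrm{Vol}(\phi(E))$.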


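Theorem 12.72. Let $U$ be an open set in $\mathbb{R}^n$ and let $\phi : U \to \mathbb{R}^n$ be one to one and $C^1$ with $\Delta_\phi \ne 0$ on $U$. Let $E$ be a Jordan region so that $\overline{E} \subseteq U$. Then $\mathrm{Vol}(\phi(E)) = \int_{\phi(E)} 1 = \int_E |\Delta_\phi|$.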


Proof. Let $F : \mathcal{J} \to [0, \infty)$, where $\mathcal{J}$ is the set of all Jordan regions whose closures are contained in $U$, be a volume continuous, additive, monotone set function so that $F(C) = \mathrm{Vol}(\phi(C))$ for every cube $C \subseteq U$. We know $\int_{\phi(E)} 1 = \mathrm{Vol}(\phi(E))$ by Theorem 12.44. We know that $F(E) = \mathrm{Vol}(\phi(E))$ for every Jordan region $E$ whose closure is contained in $U$ (and that such a function $F$ exists) by Theorem 12.71.

By Theorem 12.70, $F$ is uniformly differentiable on any compact convex subset of $U$ with derivative $F'(p) = |\Delta_\phi(p)|$. Hence, by Theorem 12.67, for every Jordan region $E$ whose closure is contained in $U$, $F(E) = \int_E |\Delta_\phi| = \mathrm{Vol}(\phi(E)) = \int_{\phi(E)} 1$.
E o(E) 


Theorem 12.73. Change of Variables (or Transformation of Variables). Let $\overline{E} \subseteq U$, where $E$ is a Jordan region and $U$ is open in $\mathbb{R}^n$. Let $\phi : U \to \mathbb{R}^n$ be a one to one continuously differentiable function so that $\Delta_\phi \ne 0$ on $U$, and let $f$ be integrable on $\phi(E)$. Then $\int_{\phi(E)} f = \int_E f \circ \phi\,|\Delta_\phi|$.


Proof. First, we show that $f \circ \phi\,|\Delta_\phi|$ is integrable on $E$. Let $D_f$ be the set of discontinuities of $f$. Then $\lambda(D_f) = 0$ by the Lebesgue Characterization of Riemann Integrability. Since $\phi^{-1}$ is a one to one continuously differentiable function so that $\Delta_{\phi^{-1}} \ne 0$ on $\phi(U)$ by the Inverse Function Theorem, it follows from Theorem 12.59 that $\lambda(\phi^{-1}(D_f)) = 0$. For any point $p \in E \setminus \phi^{-1}(D_f)$ we know that $\phi$ is continuous at $p$ and $f$ is continuous at $\phi(p)$ since $\phi(p) \notin D_f$ because $\phi$ is one to one, and since $|\Delta_\phi|$ is also continuous we know that $f \circ \phi\,|\Delta_\phi|$ is continuous at $p$. Hence, if $D_{f \circ \phi|\Delta_\phi|}$ is the set of discontinuities of $f \circ \phi\,|\Delta_\phi|$ in $E$ then $\lambda(D_{f \circ \phi|\Delta_\phi|}) = 0$, so $f \circ \phi\,|\Delta_\phi|$ is integrable on $E$. Note that for a rectangle $R$ in the interior of the range of $\phi$, since $\phi^{-1}$ is also a one to one continuously differentiable function so that $\Delta_{\phi^{-1}} \ne 0$ on the open set $\phi(U)$ by the Inverse Function Theorem, it follows by Theorem 12.72 that $\int_{\phi^{-1}(R)} |\Delta_\phi| = \int_R 1 = \mathrm{Vol}(R)$.

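Next, observe that $f = \frac{|f| + f}{2} - \frac{|f| - f}{2}$, where both $\frac{|f| + f}{2}$ and $\frac{|f| - f}{2}$ are non-negative and integrable. Hence, if we can prove the theorem for non-negative functions then this will prove the result for $\frac{|f| + f}{2}$ and $\frac{|f| - f}{2}$, which would show

$$\int_{\phi(E)} \frac{|f| + f}{2} = \int_E \left(\frac{|f| + f}{2}\right) \circ \phi\,|\Delta_\phi| \quad \text{and} \quad \int_{\phi(E)} \frac{|f| - f}{2} = \int_E \left(\frac{|f| - f}{2}\right) \circ \phi\,|\Delta_\phi|.$$

From this, by subtracting the second from the first of these integrals, we would have $\int_{\phi(E)} f = \int_E f \circ \phi\,|\Delta_\phi|$ (even if $f$ takes on negative values). Thus, it is sufficient to prove the theorem for non-negative functions $f$.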


We assume that $f$ is non-negative. Let $\epsilon > 0$. Choose $\delta_1 > 0$ so that if $S$ is a set whose diameter is no more than $\delta_1$ and $S$ intersects $\phi(E)$ then $S \subseteq \phi(U)$. This is possible (by the Lebesgue Number Lemma) since $\overline{\phi(E)} = \phi(\overline{E})$ is compact and contained in the open set $\phi(U)$. We can also choose $\delta_2 > 0$ so that if $H$ is a grid on a rectangle containing $\phi(E)$ with $|H| < \delta_2$ then all upper sums, upper inner sums, upper outer sums, lower sums, lower inner sums and lower outer sums of $f$ on $\phi(E)$ with respect to grid $H$ are within $\epsilon$ of $\int_{\phi(E)} f$ by Theorem 12.49. Let $\delta = \min\{\delta_1, \delta_2\}$.


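Let $G = \{R_i\}_{1 \le i \le k}$ be a grid on a rectangle containing $\phi(E)$ so that $|G| < \delta$. Let $V = \bigcup_{R_i \in S(\phi(E),G)} \phi^{-1}(R_i)$. By Theorem 12.72, we know that $\int_{\phi^{-1}(R_i)} |\Delta_\phi| = \int_{R_i} 1 = \mathrm{Vol}(R_i)$ for any $R_i \in G$. So,

$$U(f,G) = \sum_{R_i \in S(\phi(E),G)} M_i \mathrm{Vol}(R_i) = \sum_{R_i \in S(\phi(E),G)} M_i \int_{\phi^{-1}(R_i)} |\Delta_\phi| \ge \sum_{R_i \in S(\phi(E),G)} \int_{\phi^{-1}(R_i)} f \circ \phi\,|\Delta_\phi| = \int_V f \circ \phi\,|\Delta_\phi|.$$

Also,

$$L(f,G)^{\circ} = \sum_{R_i \in I(\phi(E),G)} m_i \mathrm{Vol}(R_i) = \sum_{R_i \in I(\phi(E),G)} m_i \int_{\phi^{-1}(R_i)} |\Delta_\phi| \le \sum_{R_i \in I(\phi(E),G)} \int_{\phi^{-1}(R_i)} f \circ \phi\,|\Delta_\phi| = \int_W f \circ \phi\,|\Delta_\phi|,$$

where $W = \bigcup_{R_i \in I(\phi(E),G)} \phi^{-1}(R_i)$.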


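We know $L(f,G)^{\circ} \le L(f,G)$ since $f$ is non-negative, and that $L(f,G) \le \int_{\phi(E)} f \le U(f,G)$, and we know that $U(f,G) - L(f,G)^{\circ} < 2\epsilon$. Since $E \subseteq V$ we know that $\int_V f \circ \phi\,|\Delta_\phi| \ge \int_E f \circ \phi\,|\Delta_\phi|$, so $\int_E f \circ \phi\,|\Delta_\phi| \le U(f,G)$. Since $W \subseteq E$ and $f$ is non-negative it follows that $L(f,G)^{\circ} \le \int_W f \circ \phi\,|\Delta_\phi| \le \int_E f \circ \phi\,|\Delta_\phi|$. Thus we see $\int_E f \circ \phi\,|\Delta_\phi|, \int_{\phi(E)} f \in [L(f,G)^{\circ}, U(f,G)]$, so $\left|\int_{\phi(E)} f - \int_E f \circ \phi\,|\Delta_\phi|\right| < 2\epsilon$. Since this is true for all $\epsilon > 0$ we conclude that $\int_{\phi(E)} f = \int_E f \circ \phi\,|\Delta_\phi|$. This completes the proof.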
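
As an informal numerical check of Theorem 12.73 (not part of the proof), the following Python sketch compares both sides of the formula for one sample choice. The choices are assumptions made for illustration: $\phi(r, \theta) = (r\cos\theta, r\sin\theta)$ on $E = [1,2] \times [0, \pi/2]$, so that $\phi(E)$ is a quarter annulus and $|\Delta_\phi| = r$, and $f(x,y) = xy$. Both integrals are approximated with midpoint sums, and the exact common value is $15/8$.

```python
import math


def f(x, y):
    # Sample integrand (an assumption for this sketch).
    return x * y


def rhs_over_E(n=200):
    # Midpoint sum for the integral over E = [1, 2] x [0, pi/2] of
    # f(phi(r, t)) * |det D(phi)|, where phi is the polar map and |det| = r.
    dr = 1.0 / n
    dt = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        r = 1.0 + (i + 0.5) * dr
        for j in range(n):
            t = (j + 0.5) * dt
            total += f(r * math.cos(t), r * math.sin(t)) * r
    return total * dr * dt


def lhs_over_image(n=400):
    # Midpoint sum for the integral of f over phi(E), the quarter annulus
    # 1 <= x^2 + y^2 <= 4 with x, y >= 0, using a grid on [0, 2] x [0, 2].
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            if 1.0 <= x * x + y * y <= 4.0:
                total += f(x, y)
    return total * h * h


if __name__ == "__main__":
    print(lhs_over_image(), rhs_over_E(), 15.0 / 8.0)
```

The sum over the image is less accurate because grid cells straddle the curved boundary, which is one practical reason to prefer the transformed integral over the rectangle $E$.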


We often want to use polar coordinates to integrate over disks in the plane, or an analogue of polar coordinates in three dimensions to deal with triple integrals. The first of these is cylindrical coordinates, which is convenient for integrating over cylinders or prisms over circular sectors, and which is also helpful for many other shapes like cones and paraboloids. In cylindrical coordinates we just add a third variable $z$ which is equal to the $z$ coordinate of the original point. Thus, if the projection of a point $(x,y,z)$ onto the $xy$-plane is $(x,y,0)$ and in the plane the point $(x,y)$ is $(r,\theta)$ in polar coordinates using the principal polar representation of $(x,y)$, then the cylindrical coordinates representation for $(x,y,z)$ is $(r,\theta,z)$. Hence, using the usual representations in polar coordinates, we have that $x = r\cos(\theta)$, $y = r\sin(\theta)$, $z = z$ is the one to one (except at $r = 0$ and possibly at $\theta = 2\pi$ if we map the entire closed cylinder) $C^1$ map from the rectangle $[0,R] \times [0,2\pi] \times [-h,h]$ to the cylinder from $z = -h$ to $z = h$ over a disk of radius $R$ centered at the origin. Hence, if a region $E$ is enclosed within such a cylinder (which any bounded region is) then $E$ is the image of some region $D$ in the $r\theta z$ three dimensional space under the cylindrical coordinates mapping thus defined. Hence

$$\iiint_E f(x,y,z)\,dV = \iiint_D f(r\cos(\theta), r\sin(\theta), z)\,|\det J|\,dV, \quad \text{where } J = \begin{pmatrix} \cos(\theta) & -r\sin(\theta) & 0 \\ \sin(\theta) & r\cos(\theta) & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

Expanding along the third row we get $|\det(J)| = r$. Hence, the conversion formula from rectangular to cylindrical coordinates is:

$$\iiint_E f(x,y,z)\,dV = \iiint_D f(r\cos(\theta), r\sin(\theta), z)\,r\,dV$$


Thus, the Jacobian factor by which we multiply when we change variables to cylindrical coordinates is the same as the factor we multiply by when converting a two variable integral to polar coordinates. As with polar coordinates, this can be intuitively described by looking at a small box with side lengths $\Delta r, \Delta\theta, \Delta z$ at a point $(r_i, \theta_j, z_k)$ in the $r\theta z$ space, and noticing that this corresponds to a section of a cylinder in the $xyz$ space where the base of this solid has area approximately $r_i\Delta\theta\Delta r$ (just as with the polar coordinates derivation, if the change in $\theta$ is small then the base is approximately rectangular and the length of one side is $\Delta r$ and the length of the other is roughly $r_i\Delta\theta$ since it is a circular section). The height of this section of cylinder is just $\Delta z$, so the volume of the resulting cylindrical section is approximately $r_i\Delta\theta\Delta r\Delta z$. Thus, we would expect

$$\lim_{n \to \infty}\lim_{m \to \infty}\lim_{l \to \infty} \sum_{i=1}^{n}\sum_{j=1}^{m}\sum_{k=1}^{l} f(r_i\cos(\theta_j), r_i\sin(\theta_j), z_k)\,r_i\,\Delta\theta\,\Delta r\,\Delta z$$

to approach the integral of $f$ over the original corresponding region that is mapped onto by the rectangles in the sums listed, so we would anticipate geometrically that these integrals would be equal.
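
The Riemann sum described above can be tried out directly. The following Python sketch is a minimal illustration under assumptions made here: the integrand $f(x,y,z) = x^2 + z$, the cylinder $r \le 1$, $0 \le z \le 1$, and midpoint sample points are all hypothetical choices, not taken from the text. For these choices the exact integral is $3\pi/4$.

```python
import math


def f(x, y, z):
    # Sample integrand (an assumption for this sketch).
    return x * x + z


def cylindrical_riemann_sum(n, radius=1.0, height=1.0):
    # Sum of f(r cos(t), r sin(t), z) * r * dr * dt * dz over an n x n x n
    # grid of midpoints in [0, radius] x [0, 2 pi] x [0, height].
    dr = radius / n
    dt = 2.0 * math.pi / n
    dz = height / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            t = (j + 0.5) * dt
            x = r * math.cos(t)
            y = r * math.sin(t)
            for k in range(n):
                z = (k + 0.5) * dz
                total += f(x, y, z) * r   # the factor r is |det J|
    return total * dr * dt * dz


if __name__ == "__main__":
    for n in (5, 10, 30):
        print(n, cylindrical_riemann_sum(n), 3.0 * math.pi / 4.0)
```

Dropping the factor `r` in the inner loop gives sums that converge to a different number, which is a quick way to see why the Jacobian factor is needed.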


Example 12.11. Find the integral $\iiint_E x^2\,dV$, where $E$ is the half of the cone bounded by $z = \sqrt{x^2 + y^2}$ and $z = 4$ where $y \ge 0$.

Solution. Vertically oriented cones with points at the origin tend to work well with conversion to cylindrical coordinates. In this case, the intersection of the cone and the plane listed is along $\sqrt{x^2 + y^2} = 4$, a radius four circle centered at the origin. The portion of that projection with non-negative $y$ value is a semicircle (making the angle $0 \le \theta \le \pi$). For the $z$ coordinate over that projection, the smallest value of $z$ would be at $\sqrt{x^2 + y^2} = r$ and the largest value is four. So, describing these points in cylindrical coordinates in the bounds of an integral we would have $\int_0^4 \int_0^\pi \int_r^4 r^2\cos^2(\theta)\,r\,dz\,d\theta\,dr$. This is

$$\int_0^4 \int_0^\pi z r^3\cos^2(\theta)\Big|_{z=r}^{z=4}\,d\theta\,dr = \int_0^4 \int_0^\pi 4r^3\cos^2(\theta)\,d\theta\,dr - \int_0^4 \int_0^\pi r^4\cos^2(\theta)\,d\theta\,dr$$

$$= 4\int_0^4 r^3\,dr \int_0^\pi \cos^2(\theta)\,d\theta - \int_0^4 r^4\,dr \int_0^\pi \cos^2(\theta)\,d\theta = 4(64)\left(\frac{\pi}{2}\right) - \frac{1024}{5}\left(\frac{\pi}{2}\right) = 128\pi - \frac{512\pi}{5} = \frac{128\pi}{5}.$$
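
As a sanity check on the value $\frac{128\pi}{5}$, the iterated integral can be approximated numerically. The Python sketch below is an illustration only: the function name `example_12_11` and the midpoint rule are choices made here, not part of the text.

```python
import math


def integrand(x, y, z):
    # Integrand of Example 12.11 in rectangular coordinates.
    return x * x


def example_12_11(n=60):
    # Midpoint sum in cylindrical coordinates for 0 <= r <= 4,
    # 0 <= theta <= pi, r <= z <= 4, including the Jacobian factor r.
    dr = 4.0 / n
    dt = math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        dz = (4.0 - r) / n          # z runs from r up to 4
        for j in range(n):
            t = (j + 0.5) * dt
            x = r * math.cos(t)
            y = r * math.sin(t)
            for k in range(n):
                z = r + (k + 0.5) * dz
                total += integrand(x, y, z) * r * dz * dt * dr
    return total


if __name__ == "__main__":
    print(example_12_11(), 128.0 * math.pi / 5.0)
```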


Another way to generalize polar coordinates to three dimensions is spherical coordinates. This system, as the name suggests, offers a way to more easily integrate over spheres, regions between spheres, and sections of spheres. Some such sections look more like ice cream cones with spheres at the top, depending on the bounds of the variables. For this coordinate system we let $\rho = \sqrt{x^2 + y^2 + z^2}$, which is the distance from the origin to the point $(x,y,z)$ in space. We then let $\theta$ be the same angle as in cylindrical coordinates (which is the same $\theta$ as polar coordinates for the projected point $(x,y)$ in the plane). Finally, we let $\phi$ be the (smallest) angle from the positive $z$-axis to the line segment from the origin to $(x,y,z)$. We notice then, using trigonometry, that $z = \rho\cos(\phi)$. Likewise, we see that $r = \rho\sin(\phi)$ (where $r$ is the same as it is in cylindrical coordinates, the distance from the origin to the point $(x,y)$ in the plane). Thus, $x = \rho\sin(\phi)\cos(\theta)$ and $y = \rho\sin(\phi)\sin(\theta)$. This allows us to express the transformation from the $\rho\theta\phi$ space to the $xyz$ space using $x = \rho\sin(\phi)\cos(\theta)$, $y = \rho\sin(\phi)\sin(\theta)$ and $z = \rho\cos(\phi)$. Using the transformation of variables formula, we get:


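$$\iiint_E f(x,y,z)\,dV = \iiint_D f(\rho\sin(\phi)\cos(\theta), \rho\sin(\phi)\sin(\theta), \rho\cos(\phi))\,|\det J|\,dV$$

The matrix is

$$J = \begin{pmatrix} \sin(\phi)\cos(\theta) & \sin(\phi)\sin(\theta) & \cos(\phi) \\ -\rho\sin(\phi)\sin(\theta) & \rho\sin(\phi)\cos(\theta) & 0 \\ \rho\cos(\phi)\cos(\theta) & \rho\cos(\phi)\sin(\theta) & -\rho\sin(\phi) \end{pmatrix}.$$

Expanding along the third column, we get

$$-\rho^2\cos(\phi)\left[\sin^2(\theta)\sin(\phi)\cos(\phi) + \cos^2(\theta)\sin(\phi)\cos(\phi)\right] - \rho^2\sin(\phi)\left[\sin^2(\phi)\cos^2(\theta) + \sin^2(\phi)\sin^2(\theta)\right]$$

$$= -\rho^2\cos^2(\phi)\sin(\phi) - \rho^2\sin^3(\phi) = -\rho^2\sin(\phi)\left(\cos^2(\phi) + \sin^2(\phi)\right) = -\rho^2\sin(\phi).$$

This means that $|\det J| = \rho^2\sin(\phi)$ (assuming we only use values of $\phi$ in the interval $[0,\pi]$; values outside this interval are unnecessary).

Thus, the formula for converting to spherical coordinates is:

$$\iiint_E f(x,y,z)\,dV = \iiint_D f(\rho\sin(\phi)\cos(\theta), \rho\sin(\phi)\sin(\theta), \rho\cos(\phi))\,\rho^2\sin(\phi)\,dV$$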
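
The determinant computed above can be checked numerically at a sample point. The following Python sketch is an illustration under assumptions made here: the partial derivatives are approximated by central differences with a hypothetical step size, and the rows are ordered as derivatives with respect to $\rho$, $\theta$, $\phi$, matching the layout of $J$ used in the expansion.

```python
import math


def spherical_map(rho, theta, phi):
    # (rho, theta, phi) -> (x, y, z), with phi measured from the positive z-axis.
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))


def numeric_jacobian_det(rho, theta, phi, step=1e-6):
    # Row idx holds the central-difference derivatives of (x, y, z) with
    # respect to the idx-th input variable (rho, theta, phi in that order).
    point = [rho, theta, phi]
    rows = []
    for idx in range(3):
        forward = list(point)
        backward = list(point)
        forward[idx] += step
        backward[idx] -= step
        f_plus = spherical_map(*forward)
        f_minus = spherical_map(*backward)
        rows.append([(p - m) / (2.0 * step) for p, m in zip(f_plus, f_minus)])
    (a, b, c), (d, e, f), (g, h, i) = rows
    # Cofactor expansion of the 3 x 3 determinant along the first row.
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


if __name__ == "__main__":
    rho, theta, phi = 2.0, 0.7, 1.1
    print(numeric_jacobian_det(rho, theta, phi), -rho * rho * math.sin(phi))
```

The two printed numbers should agree to several decimal places, and the absolute value is the factor $\rho^2\sin(\phi)$ that appears in the conversion formula.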


As with cylindrical coordinates, an intuitive interpretation can be seen if we consider the spherical wedge formed by taking the image of a small box in the $\rho\theta\phi$ space under the spherical coordinates transformation at $(\rho_i, \theta_j, \phi_k)$. We get an image which is approximately a rectangular box, and we notice that the length of the edge of the wedge section corresponding to the change in the angle $\phi$ is approximately $\rho\Delta\phi$ throughout the wedge section (though it is slightly longer on the outer edge than the inner edge), that the length of the edge along the circular section corresponding to a change $\Delta\theta$ is $r\Delta\theta$, which is $\rho\sin(\phi)\Delta\theta$, and that the depth of the wedge section is $\Delta\rho$, so the volume of the corresponding wedge section is approximately $\rho^2\sin(\phi)\Delta\rho\Delta\theta\Delta\phi$ (it approaches this value as the number of subdivisions becomes large). Hence, the integral should correspond to the limit:

$$\lim_{n \to \infty}\lim_{m \to \infty}\lim_{l \to \infty} \sum_{i=1}^{n}\sum_{j=1}^{m}\sum_{k=1}^{l} f(\rho_i\sin(\phi_k)\cos(\theta_j), \rho_i\sin(\phi_k)\sin(\theta_j), \rho_i\cos(\phi_k))\,\rho_i^2\sin(\phi_k)\,\Delta\rho\,\Delta\theta\,\Delta\phi.$$

Deciding which coordinate system to use to evaluate an integral is not always straightforward, and frequently multiple choices will work well. While the following rule of thumb is not true in general, it is usually not worth converting a planar integral to polar coordinates unless the region to be integrated over is a circular sector or a region between two circular sectors (unless it is easy to see that the planar region is bounded by another nice curve in polar coordinates, like a rose). For the most part, there isn't much point in converting to cylindrical coordinates unless the projection of the region onto one of the coordinate planes is a nice polar region of the types we just described. We typically don't want to convert to spherical coordinates unless the outer surface of the region is on a sphere (and the solid is preferably a region within a spherical wedge or between two spherical wedges), in which case spherical coordinates are often better than cylindrical coordinates. In other integrals where the projection onto one of the coordinate planes is a nice polar region (such as cones with flat tops, cylinders or paraboloids), cylindrical coordinates tend to work better than spherical coordinates. If none of these criteria apply, you are probably better off leaving the integral in rectangular coordinates.


Example 12.12. Find $\int_{-2}^{2} \int_0^{\sqrt{4 - x^2}} \int_0^{\sqrt{4 - x^2 - y^2}} x^2 y^2 z^2\,dz\,dy\,dx$.

Solution. We see that the region where $-2 \le x \le 2$ and $0 \le y \le \sqrt{4 - x^2}$ is the upper half of the radius two disk centered at the origin. Then, if $0 \le z \le \sqrt{4 - x^2 - y^2}$ we end up with the solid over that half disk inside the radius two sphere about the origin. Since this is a portion of a sphere, we would guess that spherical coordinates would probably give the nicest transformation. The spherical bounds would have angles $\theta$ ranging from zero to $\pi$, as normal for the upper half of a disk. For each angle $\theta$ the angles $\phi$ from the positive $z$-axis range from zero to $\frac{\pi}{2}$ (coming halfway down to the negative $z$-axis). The radii at each of these angles range from zero to two because the solid lies in a radius two sphere. This gives us

$$\int_0^{\pi} \int_0^{\pi/2} \int_0^2 \rho^2\sin^2(\phi)\cos^2(\theta)\,\rho^2\sin^2(\phi)\sin^2(\theta)\,\rho^2\cos^2(\phi)\,\rho^2\sin(\phi)\,d\rho\,d\phi\,d\theta$$

$$= \int_0^{\pi} \cos^2(\theta)\sin^2(\theta)\,d\theta \int_0^{\pi/2} \sin^5(\phi)\cos^2(\phi)\,d\phi \int_0^2 \rho^8\,d\rho = \left(\frac{\pi}{8}\right)\left(\frac{(4)(2)(1)}{(7)(5)(3)}\right)\left(\frac{2^9}{9}\right) = \frac{512\pi}{945}.$$
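
The value $\frac{512\pi}{945}$ can be checked from both sides of the transformation. The Python sketch below is an illustration only, with choices made here rather than taken from the text: `spherical_value` multiplies three one-variable midpoint sums, and `rectangular_value` sums the original integrand over a grid restricted to the solid.

```python
import math


def midpoint(g, a, b, n=2000):
    # Composite midpoint rule for a function of one variable.
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def spherical_value():
    # Product of the three single integrals obtained in spherical coordinates.
    theta_part = midpoint(lambda t: math.cos(t) ** 2 * math.sin(t) ** 2, 0.0, math.pi)
    phi_part = midpoint(lambda p: math.sin(p) ** 5 * math.cos(p) ** 2, 0.0, math.pi / 2.0)
    rho_part = midpoint(lambda r: r ** 8, 0.0, 2.0)
    return theta_part * phi_part * rho_part


def rectangular_value(cells_per_unit=20):
    # Midpoint sum of x^2 y^2 z^2 over the part of the box
    # [-2, 2] x [0, 2] x [0, 2] that lies inside the radius two sphere.
    h = 1.0 / cells_per_unit
    total = 0.0
    for i in range(4 * cells_per_unit):
        x = -2.0 + (i + 0.5) * h
        for j in range(2 * cells_per_unit):
            y = (j + 0.5) * h
            for k in range(2 * cells_per_unit):
                z = (k + 0.5) * h
                if x * x + y * y + z * z <= 4.0:
                    total += x * x * y * y * z * z
    return total * h ** 3


if __name__ == "__main__":
    print(spherical_value(), rectangular_value(), 512.0 * math.pi / 945.0)
```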


Using the fact that converting to spherical simplifies integration over a sphere, it becomes 
easier to use transformation of variables to integrate over regions bounded by ellipsoids just 
as such transformations made it easier to integrate over elliptic disks once we knew polar 
transformations. 


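Example 12.13. Let $E$ be the region bounded by the ellipsoid $\frac{x^2}{4} + \frac{y^2}{9} + \frac{z^2}{16} = 1$. Find $\iiint_E z^4\,dV$.

Solution. We set $x = 2u$, $y = 3v$ and $z = 4w$ to get $u^2 + v^2 + w^2 = 1$, the unit sphere, meaning that this transformation takes the unit sphere onto this ellipsoid. We find the Jacobian

$$\det \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 4 \end{pmatrix} = 24.$$

Thus, if $F$ is the unit ball, the integral becomes

$$\iiint_F 24(4w)^4\,dV = (24)(256)\int_0^{2\pi} \int_0^{\pi} \int_0^1 (\rho\cos(\phi))^4 \rho^2\sin(\phi)\,d\rho\,d\phi\,d\theta = (24)(256)(2\pi)\int_0^{\pi} \cos^4(\phi)\sin(\phi)\,d\phi \int_0^1 \rho^6\,d\rho$$

$$= (24)(256)(2\pi)\left(\frac{2}{5}\right)\left(\frac{1}{7}\right) = \frac{24576\pi}{35}.$$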
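
A rough numerical check of this example is possible without any change of variables. The following Python sketch (an illustration with hypothetical function names and grid size chosen here) sums $z^4$ over midpoints of a grid on the box $[-2,2] \times [-3,3] \times [-4,4]$, keeping only the points inside the ellipsoid, and compares the result with $\frac{24576\pi}{35}$.

```python
import math


def ellipsoid_integral(n=40):
    # Midpoint sum of z^4 over the solid x^2/4 + y^2/9 + z^2/16 <= 1,
    # using an n x n x n grid on the bounding box.
    hx, hy, hz = 4.0 / n, 6.0 / n, 8.0 / n
    total = 0.0
    for i in range(n):
        x = -2.0 + (i + 0.5) * hx
        for j in range(n):
            y = -3.0 + (j + 0.5) * hy
            for k in range(n):
                z = -4.0 + (k + 0.5) * hz
                if x * x / 4.0 + y * y / 9.0 + z * z / 16.0 <= 1.0:
                    total += z ** 4
    return total * hx * hy * hz


def exact_value():
    # Value obtained in the worked solution.
    return 24576.0 * math.pi / 35.0


if __name__ == "__main__":
    print(ellipsoid_integral(), exact_value())
```

Because grid cells straddle the boundary of the ellipsoid, the agreement is only approximate; refining the grid should bring the two numbers closer.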


Spherical and cylindrical and polar coordinates with respect to different axes: 


Sometimes we want to use polar coordinates over a coordinate plane, such as the $yz$-plane, that we want to project onto. This can be done easily by just selecting variables to play the roles of $x$ and $y$ (assign $y = r\cos(\theta)$ and $z = r\sin(\theta)$ for instance) and then converting the two variable integral to polar coordinates and doing the integral as normal.

Sometimes we want to use cylindrical coordinates over a solid $E$ which is easier to project onto another plane (say the $yz$-plane again). We would then let $x = x$, $y = r\cos(\theta)$ and $z = r\sin(\theta)$. It would also have been fine to set $z = r\cos(\theta)$ and $y = r\sin(\theta)$. We just have to be consistent about replacing the corresponding variable with its transformed form in the integrand.

Sometimes we want to use spherical coordinates and measure the angle $\phi$ from a different direction. Again, let's say we would like $\phi$ to be measured from the positive $x$-axis, so that $\rho = \sqrt{x^2 + y^2 + z^2}$ again, but now $x = \rho\cos(\phi)$, $y = \rho\sin(\phi)\sin(\theta)$, $z = \rho\sin(\phi)\cos(\theta)$. Or we could have switched the expressions assigned to $z$ and $y$; that would have worked just as well.

In each case, however, we would have to make sure that the integral bounds for describing the region correspond to whatever variable definitions we assign.

Frequently, it is easier to think of this procedure as just interchanging two variables everywhere and keeping the variable definitions the same. This changes the position of the solid and switches the variables corresponding to the switched axes in the integrand. In this manner, we don't have to re-define the transformation and instead switch the variables themselves.


Example 12.14. Find the mass of the region $E$ bounded by $x = y^2 + z^2$ and $x = 1$ so that $z \ge 0$, with density function $\rho(x,y,z) = y^2 x$.

Solution. This is better done with cylindrical coordinates than spherical coordinates. The projection of this solid onto the $yz$-plane is the upper half of a unit disk since $y^2 + z^2 = 1$ at the intersection of the two surfaces.

Let's proceed by several approaches. First, we will use a rotated version of cylindrical coordinates. We will have $x = x$ and set $y = r\cos(\theta)$ and $z = r\sin(\theta)$ where $r = \sqrt{y^2 + z^2}$, and $\theta$ is the angle made with the positive $y$-axis measured towards the positive $z$-axis (counterclockwise as viewed from the positive $x$-axis looking at the $yz$-plane). The disk would be traced out over $0 \le \theta \le \pi$ and $0 \le r \le 1$. Over that disk the lower surface is $x = y^2 + z^2$, so $r^2 \le x \le 1$. Thus, the integral is

$$\int_0^{\pi} \int_0^1 \int_{r^2}^1 x r^3\cos^2(\theta)\,dx\,dr\,d\theta = \int_0^{\pi} \int_0^1 \frac{x^2}{2} r^3\cos^2(\theta)\Big|_{x=r^2}^{x=1}\,dr\,d\theta = \frac{1}{2}\int_0^{\pi} \int_0^1 r^3\cos^2(\theta) - r^7\cos^2(\theta)\,dr\,d\theta$$

$$= \frac{1}{2}\int_0^{\pi} \cos^2(\theta)\,d\theta \int_0^1 r^3 - r^7\,dr = \frac{1}{2}\left(\frac{\pi}{2}\right)\left(\frac{1}{4} - \frac{1}{8}\right) = \frac{\pi}{32}.$$
Next, let's use the approach where we re-orient the axes (switching $x$ and $z$). So, the mass listed would be the same as the mass if we took the region $E$ bounded by $z = x^2 + y^2$ and $z = 1$ so that $x \ge 0$, with density function $\rho(x,y,z) = y^2 z$. Because we changed the variables in both the integrand and the description of the region, we should end up with the same mass as the original solid had. Projecting down onto the $xy$-plane we have the right half of the unit disk. This gives an integral $\int_{-\pi/2}^{\pi/2} \int_0^1 \int_{r^2}^1 z r^3\sin^2(\theta)\,dz\,dr\,d\theta$. The only differences between this integral and the previous one are the bounds on the first integral and the trigonometric factor, but we see that $\int_{-\pi/2}^{\pi/2} \sin^2(\theta)\,d\theta = \int_0^{\pi} \cos^2(\theta)\,d\theta$ (so the integral will give the same answer).

Finally, what if we tried to change the cylindrical coordinate system so that $x = x$ and set $y = r\sin(\theta)$ and $z = r\cos(\theta)$, where $r = \sqrt{y^2 + z^2}$ and $\theta$ is now measured from the positive $z$-axis towards the positive $y$-axis? Then the bounds for the region would have $-\frac{\pi}{2} \le \theta \le \frac{\pi}{2}$ and the integral would be $\int_{-\pi/2}^{\pi/2} \int_0^1 \int_{r^2}^1 x r^3\sin^2(\theta)\,dx\,dr\,d\theta$, which has the same value as the integral in the previous method.


So, the bounds change based on our assignment but as long as we remain consistent 
and describe the region in the new variables we have some flexibility about how we reassign 
cylindrical coordinates. 
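
The agreement between these set-ups can be confirmed numerically. The Python sketch below is an illustration with hypothetical helper names chosen here: it evaluates the first iterated integral (with $\cos^2(\theta)$ on $[0,\pi]$) and the iterated integral shared by the second and third approaches (with $\sin^2(\theta)$ on $[-\pi/2, \pi/2]$) by midpoint sums, and compares both with $\frac{\pi}{32}$.

```python
import math


def iterated_midpoint(integrand, theta_lo, theta_hi, n=40):
    # Midpoint sum for the iterated integral over theta in [theta_lo, theta_hi],
    # r in [0, 1], and the axial variable s in [r^2, 1].
    dt = (theta_hi - theta_lo) / n
    dr = 1.0 / n
    total = 0.0
    for a in range(n):
        t = theta_lo + (a + 0.5) * dt
        for b in range(n):
            r = (b + 0.5) * dr
            ds = (1.0 - r * r) / n
            for c in range(n):
                s = r * r + (c + 0.5) * ds
                total += integrand(s, r, t) * ds
    return total * dr * dt


def mass_first_approach():
    # x = x, y = r cos(theta), z = r sin(theta), 0 <= theta <= pi.
    return iterated_midpoint(lambda x, r, t: x * r ** 3 * math.cos(t) ** 2,
                             0.0, math.pi)


def mass_other_approaches():
    # Same integrand shape with sin^2(theta) on [-pi/2, pi/2].
    return iterated_midpoint(lambda s, r, t: s * r ** 3 * math.sin(t) ** 2,
                             -math.pi / 2.0, math.pi / 2.0)


if __name__ == "__main__":
    print(mass_first_approach(), mass_other_approaches(), math.pi / 32.0)
```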




Spherical coordinates are similar. Here is an example. 


Example 12.15. Find $\iiint_E y\,dV$ if $E$ is the solid bounded by $x^2 + y^2 + z^2 = 4$ and $y = \sqrt{x^2 + z^2}$.

Solution. We can see that $E$ is a spherical wedge which is symmetric about the positive $y$-axis. In spherical coordinates, it would be easier to describe this wedge if $\phi$ were measured from the positive $y$-axis instead of the $z$-axis.

First, let's do the problem by reassigning the spherical coordinates transformation on the variables. We would measure $\phi$ as the angle from the positive $y$-axis to the line segment connecting the origin to each point. We would have $y = \rho\cos(\phi)$ and $x = \rho\sin(\phi)\cos(\theta)$ and $z = \rho\sin(\phi)\sin(\theta)$ (we could have reversed the assignments of $x$ and $z$ and this would have altered how $\theta$ was measured, but as long as our description in the bounds was consistent with the assignment it would have been fine).

With this assignment, we see that when $\theta = 0$ we are starting where $z = 0$ and $x$ is positive. When $\theta = \frac{\pi}{2}$, $x$ would be zero and $z$ would be positive, so we can see that $\theta$ is being measured from the positive $x$ direction towards the positive $z$-axis direction, which is clockwise as viewed from the positive $y$-axis. The angle $\phi$ is zero at the $y$-axis and $\frac{\pi}{4}$ when $y = r$ (where $r = \sqrt{x^2 + z^2}$). Having observed this, we would be able to describe the integral over the solid listed as

$$\int_0^{2\pi} \int_0^{\pi/4} \int_0^2 \rho\cos(\phi)\,\rho^2\sin(\phi)\,d\rho\,d\phi\,d\theta = \int_0^{2\pi} 1\,d\theta \int_0^{\pi/4} \left(\frac{1}{2}\right)\sin(2\phi)\,d\phi \int_0^2 \rho^3\,d\rho = (2\pi)\left(-\frac{\cos(2\phi)}{4}\Big|_0^{\pi/4}\right)\left(\frac{\rho^4}{4}\Big|_0^2\right) = (2\pi)\left(\frac{1}{4}\right)(4) = 2\pi.$$


As an alternate approach, we will switch the $y$ and $z$ variables so that the spherical wedge is symmetric about the positive $z$-axis, which makes spherical coordinates in the original system convenient to use. We would then see the region $E$ as being the solid bounded by $x^2 + y^2 + z^2 = 4$ and $z = \sqrt{x^2 + y^2}$. This would have $0 \le \theta \le 2\pi$ and $0 \le \phi \le \frac{\pi}{4}$ and $0 \le \rho \le 2$. The integrand $y$ would be replaced by $z$ since we are switching those two variables. This changes the integral to $\int_0^{2\pi} \int_0^{\pi/4} \int_0^2 \rho\cos(\phi)\,\rho^2\sin(\phi)\,d\rho\,d\phi\,d\theta$, which is the same as before.
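
As a check on the value $2\pi$, the integral can be approximated in both coordinate systems. The Python sketch below is an illustration with hypothetical function names and grid sizes chosen here: `spherical_value` multiplies one-variable midpoint sums, and `rectangular_value` sums $y$ over grid midpoints in the box $[-\sqrt{2}, \sqrt{2}] \times [0, 2] \times [-\sqrt{2}, \sqrt{2}]$ that lie in the wedge.

```python
import math


def midpoint(g, a, b, n=2000):
    # Composite midpoint rule for a function of one variable.
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def spherical_value():
    # (integral of 1 over theta) * (integral over phi) * (integral over rho).
    theta_part = 2.0 * math.pi
    phi_part = midpoint(lambda p: math.cos(p) * math.sin(p), 0.0, math.pi / 4.0)
    rho_part = midpoint(lambda r: r ** 3, 0.0, 2.0)
    return theta_part * phi_part * rho_part


def rectangular_value(n=60):
    # Midpoint sum of y over the solid x^2 + y^2 + z^2 <= 4, y >= sqrt(x^2 + z^2).
    side = math.sqrt(2.0)
    hx = 2.0 * side / n
    hy = 2.0 / n
    hz = 2.0 * side / n
    total = 0.0
    for i in range(n):
        x = -side + (i + 0.5) * hx
        for j in range(n):
            y = (j + 0.5) * hy
            for k in range(n):
                z = -side + (k + 0.5) * hz
                if x * x + y * y + z * z <= 4.0 and y * y >= x * x + z * z:
                    total += y
    return total * hx * hy * hz


if __name__ == "__main__":
    print(spherical_value(), rectangular_value(), 2.0 * math.pi)
```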




Exercises: 


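Exercise 12.1. Let $f : E \to \mathbb{R}$ be integrable on a Jordan region $E$ in $\mathbb{R}^n$ so that for some $i \in \{1, 2, 3, \ldots, n\}$, if $(x_1, x_2, \ldots, x_i, \ldots, x_n) \in E$ then $(x_1, x_2, \ldots, -x_i, \ldots, x_n) \in E$, and $f(x_1, x_2, \ldots, -x_i, \ldots, x_n) = -f(x_1, x_2, \ldots, x_i, \ldots, x_n)$. Prove that $\int_E f = 0$.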


Exercise 12.2. Give an example of a function $f$ which is bounded on a rectangle $R$ but is not integrable on $R$. Prove $f$ is not integrable.


Exercise 12.3. Let $f, g : R \to \mathbb{R}$ be integrable, where $R$ is a rectangle in $\mathbb{R}^n$. Prove that $h(x) = \max\{f(x), g(x)\}$ is also integrable on $R$.


Exercise 12.4. Let $f : R \to \mathbb{R}$ be continuous, where $R$ is a rectangle in $\mathbb{R}^m$, and let $g(E) \subseteq R$, where $g : E \to \mathbb{R}^m$ is integrable and $E$ is a Jordan region in $\mathbb{R}^n$. Then prove that $f \circ g$ is integrable on $E$.


Exercise 12.5. Give an example of an integrable function $f$ on a rectangle $R$ in $\mathbb{R}^n$ so that the set of discontinuities of $f$ is a dense subset of $R$.


Exercise 12.6. Show that for any natural numbers $n$ and $m$ it is possible to find $f : R \to \mathbb{R}$ which is integrable on a rectangle $R$ in $\mathbb{R}^n$ and a function $g : E \to R$ which is continuous so that $E$ is a Jordan region in $\mathbb{R}^m$ and $f \circ g$ is not integrable on $E$.


Exercise 12.7. Use Change of Variables to integrate the function $f(x,y) = x^2$ over


(a) The region $E$ enclosed by the ellipse $\frac{x^2}{4} + \frac{y^2}{9} = 1$ and


(b) The region $E$ bounded by the lines $y - x = 0$, $y - x = 4$, $3x + y = 0$, $3x + y = 2$.


Exercise 12.8. Let $E$ be a Jordan region in $\mathbb{R}^n$. Prove or disprove: $E^\circ$ is also a Jordan region.


390 CHAPTER 12. INTEGRATION IN HIGHER DIMENSIONS 


Exercise 12.9. Prove that any bounded subset $E$ of $\mathbb{R}^n$ which has only finitely many limit points has volume zero.


Exercise 12.10. Prove that every open ball (and every closed ball) in $\mathbb{R}^n$ is a Jordan region.


Exercise 12.11. Show that the countable union of Jordan regions need not be a Jordan region, even if the union is bounded.


Exercise 12.12. Let $E$ be a Jordan region in $\mathbb{R}^n$. Define the dilation of $E$ by a factor of $k > 0$ to be $kE = \{kx \in \mathbb{R}^n \mid x \in E\}$. Prove that $kE$ is a Jordan region and $\mathrm{Vol}(kE) = k^n \mathrm{Vol}(E)$.


Exercise 12.13. Prove the Mean Value Theorem for Multiple Integrals. Let $f, g : E \to \mathbb{R}$ be integrable with $g(x) \ge 0$ on the Jordan region $E$ in $\mathbb{R}^n$. Then there is a number $c \in [\inf f(E), \sup f(E)]$ so that $c\int_E g = \int_E fg$. If $f$ is continuous and $E$ is connected then there is a value $w \in E$ so that $f(w) = c$. In particular, there is a number $c \in [\inf f(E), \sup f(E)]$ so that $c\,\mathrm{Vol}(E) = \int_E f$.


Exercise 12.14. Let $f : E \to \mathbb{R}$ be integrable on the Jordan region $E$ in $\mathbb{R}^n$, where $f$ is continuous at the point $z \in E^\circ$. Prove that $\lim_{r \to 0^+} \frac{1}{\mathrm{Vol}(B_r(z))} \int_{B_r(z)} f = f(z)$.


Exercise 12.15. Let $f, g : E \to \mathbb{R}$ be integrable on the Jordan region $E$ in $\mathbb{R}^n$. Show that if $f(x) = g(x)$ for every point $x \in \mathbb{Q}^n \cap E$ then $\int_E f = \int_E g$.




Solutions: 


Solution to Exercise 12.1. Let $f : E \to \mathbb{R}$ be integrable on a Jordan region $E$ in $\mathbb{R}^n$ so that for some $i \in \{1, 2, 3, \ldots, n\}$, if $(x_1, x_2, \ldots, x_i, \ldots, x_n) \in E$ then $(x_1, x_2, \ldots, -x_i, \ldots, x_n) \in E$, and $f(x_1, x_2, \ldots, -x_i, \ldots, x_n) = -f(x_1, x_2, \ldots, x_i, \ldots, x_n)$. Prove that $\int_E f = 0$.


Proof. Since $E$ is a Jordan region, $E$ is bounded so we can find $r > 0$ so that $E \subseteq R = \prod_{i=1}^{n} [-r, r]$. Let $\epsilon > 0$. Since $f$ is integrable, we can find a $k \in \mathbb{N}$ so that if we set $P_i = \{-r, -r + \frac{2r}{k}, -r + \frac{4r}{k}, \ldots, r\}$ for all $1 \le i \le n$, then the grid $G$ induced by $P_1, P_2, \ldots, P_n$ has small enough mesh so that $\left| \int_E f - \sum_{R_i \in G} f(x_i)|R_i| \right| < \epsilon$ regardless of choice of marking points $x_i \in R_i$. Choose the mid points of each rectangle $R_i \in G$ as the marking points $x_i$. Since $i$ now indexes rectangles, write $j$ for the coordinate index from the statement. Let $H_p$ be the set of rectangles in $G$ containing points with positive $j$th component, and let $H_n$ be the set of rectangles in $G$ containing points with negative $j$th component. For each $R_i \in H_p$ let $R_{(i,n)}$ be the element of $H_n$ which consists of the points of $R_i$ with $j$th component negated. Then if $x_i$ is the mid point of $R_i$ and $x_{(i,n)}$ is the midpoint of $R_{(i,n)}$ we know that $f(x_i) = -f(x_{(i,n)})$.


Thus, $\sum_{R_i \in G} f(x_i)|R_i| = \sum_{R_i \in H_p} f(x_i)|R_i| + \sum_{R_{(i,n)} \in H_n} f(x_{(i,n)})|R_{(i,n)}| = 0$. Thus, $\left| \int_E f \right| < \epsilon$ for every $\epsilon > 0$ and so $\int_E f = 0$.


Solution to Exercise 12.2. Give an example of a function $f$ which is bounded on a rectangle $R$ but is not integrable on $R$. Prove $f$ is not integrable.


Proof. Let $R$ be any rectangle in $\mathbb{R}^n$. Let $f(x) = 0$ if $x \in \mathbb{Q}^n$ and let $f(x) = 1$ otherwise. Then for any grid $G$ on $R$ we have $U(f,G) = |R|$ and $L(f,G) = 0$, so $U(f,G) - L(f,G) = |R| > 0$, which means that $f$ is not integrable.


Solution to Exercise 12.3. Let $f, g : R \to \mathbb{R}$ be integrable, where $R$ is a rectangle in $\mathbb{R}^n$. Prove that $h(x) = \max\{f(x), g(x)\}$ is also integrable on $R$.


Proof. Let $\epsilon > 0$. Choose a grid $G = \{R_i\}_{i=1}^{k}$ on $R$ so that $U(f,G) - L(f,G) < \frac{\epsilon}{2}$ and $U(g,G) - L(g,G) < \frac{\epsilon}{2}$. If $M_i(f) = \sup_{x \in R_i} f(x)$ and $M_i(g) = \sup_{x \in R_i} g(x)$ and $M_i = \sup_{x \in R_i} h(x)$ then $M_i = \max\{M_i(f), M_i(g)\}$. Likewise, if $m_i(f) = \inf_{x \in R_i} f(x)$ and $m_i(g) = \inf_{x \in R_i} g(x)$ and $m_i = \inf_{x \in R_i} h(x)$ then $m_i \ge \max\{m_i(f), m_i(g)\}$, since $h \ge f$ and $h \ge g$. If $M_i(f) \ge M_i(g)$ and $m_i(f) \ge m_i(g)$ then $M_i - m_i \le M_i(f) - m_i(f)$. If $M_i(g) \ge M_i(f)$ and $m_i(g) \ge m_i(f)$ then $M_i - m_i \le M_i(g) - m_i(g)$. If $M_i(f) \ge M_i(g)$ and $m_i(f) < m_i(g)$ then $M_i - m_i \le M_i(f) - m_i(g) < M_i(f) - m_i(f)$. If $M_i(f) < M_i(g)$ and $m_i(f) \ge m_i(g)$ then $M_i - m_i \le M_i(g) - m_i(f) \le M_i(g) - m_i(g)$.


Thus, $U(h,G) - L(h,G) = \sum_{R_i \in G} (M_i - m_i)|R_i| \le \sum_{R_i \in G} \max\{M_i(f) - m_i(f), M_i(g) - m_i(g)\}|R_i| \le \sum_{R_i \in G} (M_i(f) - m_i(f))|R_i| + \sum_{R_i \in G} (M_i(g) - m_i(g))|R_i| = U(f,G) - L(f,G) + U(g,G) - L(g,G) < \epsilon$. Hence, $h$ is integrable on $R$.


Solution to Exercise 12.4. Let $f : R \to \mathbb{R}$ be continuous, where $R$ is a rectangle in $\mathbb{R}^m$, and let $g(E) \subseteq R$, where $g : E \to \mathbb{R}^m$ is integrable and $E$ is a Jordan region in $\mathbb{R}^n$. Then prove that $f \circ g$ is integrable on $E$.


Proof. Let $E_g = \{x \in E \mid g \text{ is not continuous at } x\}$. Since $f$ is continuous, $f \circ g$ is continuous at every point where $g$ is continuous. Hence, if $E_{f \circ g} = \{x \in E \mid f \circ g \text{ is not continuous at } x\}$ then $E_{f \circ g} \subseteq E_g$. Since $g$ is integrable, we know that $\lambda(E_g) = 0$, which means that $\lambda(E_{f \circ g}) = 0$ since a subset of a measure zero set has measure zero. Hence, by the Lebesgue Characterization of Riemann integrability we know that $f \circ g$ is integrable.


Solution to Exercise 12.5. Give an example of an integrable function $f$ on a rectangle $R$ in $\mathbb{R}^n$ so that the set of discontinuities of $f$ is a dense subset of $R$.


Proof. Let $f(x) = 0$ if $x \notin \mathbb{Q}^n \setminus \{0\}$. If $x \in \mathbb{Q}^n \setminus \{0\}$ then let $f(x) = \frac{1}{q}$, where $q$ is the largest denominator of a coordinate of $x$ when the coordinates of $x$ are written in reduced terms.

Note that for any $k \in \mathbb{N}$ there are only finitely many elements of $\mathbb{Q}^n \setminus \{0\}$ in $R$ whose value is $\frac{1}{k}$ or larger. In particular, the $n$-tuples whose coordinates are all of the form $\frac{a_i}{k_i}$ with $1 \le k_i \le k$ are the only points whose image under $f$ could be $\frac{1}{k}$ or larger. Let $x$ be a point with at least one irrational coordinate. Let $\epsilon > 0$. Choose $k \in \mathbb{N}$ so that $\frac{1}{k} < \epsilon$. Choose $\delta > 0$ so that $B_\delta(x)$ does not contain any $y$ so that $f(y) \ge \frac{1}{k}$. Then for all $y \in B_\delta(x)$, we have $|f(x) - f(y)| = f(y) - 0 < \frac{1}{k} < \epsilon$. Thus, $f$ is continuous at $x$. Hence, the discontinuities of $f$ are the points of $\mathbb{Q}^n \setminus \{0\}$, which are countable, and thus the set of discontinuities of $f$ has Lebesgue measure zero and $f$ is integrable.
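The finiteness claim in the proof above can be illustrated at rational points with exact arithmetic. The sketch below is only an illustration under stated assumptions: it works in the unit cube $[0,1]^n$, it represents only points with rational coordinates (irrational coordinates cannot be stored exactly), and the names `f` and `points_with_value_at_least` are hypothetical.

```python
from fractions import Fraction
from itertools import product

def f(point):
    """The function from the solution, restricted to points with rational
    coordinates (given as Fractions, which are always in reduced terms)."""
    if all(c == 0 for c in point):
        return Fraction(0)          # the origin is excluded from Q^n \ {0}
    return Fraction(1, max(c.denominator for c in point))

def points_with_value_at_least(k, n):
    """All rational points of [0,1]^n where f >= 1/k (assumed unit cube)."""
    # Only coordinates with denominator at most k can give f >= 1/k.
    coords = sorted({Fraction(p, q) for q in range(1, k + 1) for p in range(q + 1)})
    return [pt for pt in product(coords, repeat=n) if f(pt) >= Fraction(1, k)]

if __name__ == "__main__":
    print(len(points_with_value_at_least(3, 2)))
```

For each fixed $k$ the returned list is finite, which is what allows the choice of $\delta$ in the continuity argument.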


Solution to Exercise 12.6. Show that for any natural number $n$ it is possible to find $f : R \to \mathbb{R}$ which is integrable on a rectangle $R$ in $\mathbb{R}^n$ and a function $g : I \to \mathbb{R}$ which is integrable on the interval $I$ so that $f(R) \subseteq I$ and $g \circ f$ is not integrable on $R$.


Proof. Let $R = \prod_{i=1}^{n} [0, 1]$. Let $g(x) = 1$ if $x \ne 0$ and let $g(0) = 0$. Since $g$ has only one discontinuity, $g$ is integrable on $[0,1]$. Let $f(x) = 0$ if $x \notin \mathbb{Q}^n \setminus \{0\}$. If $x \in \mathbb{Q}^n \setminus \{0\}$ then let $f(x) = \frac{1}{q}$, where $q$ is the largest denominator of a coordinate of $x$ when the coordinates of $x$ are written in reduced terms. It was shown in the preceding exercise that this function is integrable.

The composition $g \circ f$ is zero at every point of $R$ with at least one irrational coordinate and $g \circ f(x) = 1$ for all $x \in \mathbb{Q}^n \setminus \{0\}$ in $R$. This composition is not integrable since it is discontinuous at every point of $R$: every open ball contains points where $g \circ f$ takes on the values one and zero, so for every point $x$ it is false that there is a $\delta > 0$ so that if $|x - y| < \delta$ then $|g \circ f(x) - g \circ f(y)| < 1$.


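Solution to Exercise 12.7. Use Change of Variables to integrate the function $f(x,y) = x^2$ over


(a) The region $E$ enclosed by the ellipse $\frac{x^2}{4} + \frac{y^2}{9} = 1$ and


(b) The region $E$ bounded by the lines $y - x = 0$, $y - x = 4$, $3x + y = 0$, $3x + y = 2$.

Solution. (a) We use the transformation $x = 2u$ and $y = 3v$ so that $u^2 + v^2 \le 1$ corresponds to the points inside the ellipse $\frac{x^2}{4} + \frac{y^2}{9} = 1$. The Jacobian of the transformation is $\begin{vmatrix} 2 & 0 \\ 0 & 3 \end{vmatrix} = 6$. Hence, the integral $\iint_E x^2\,dA = \iint_D 6(2u)^2\,dA$, where $D$ is the unit disk. Converting to polar coordinates this becomes $24\int_0^1 \int_0^{2\pi} r^3 \cos^2(\theta)\,d\theta\,dr = 24\left(\frac{1}{4}\right)(\pi) = 6\pi$.


(b) To evaluate $\iint_E x^2\,dA$ we set $u = y - x$ and $v = 3x + y$. Solving for $x$ and $y$ gives $x = \frac{v - u}{4}$ and $y = \frac{3u + v}{4}$. This gives a Jacobian of $\begin{vmatrix} -\frac{1}{4} & \frac{1}{4} \\ \frac{3}{4} & \frac{1}{4} \end{vmatrix} = -\frac{1}{16} - \frac{3}{16} = -\frac{1}{4}$, whose absolute value is $\frac{1}{4}$. This change of variables changes the integral to

$\frac{1}{4}\int_0^4 \int_0^2 \frac{1}{16}(v^2 - 2uv + u^2)\,dv\,du = \frac{1}{64}\int_0^4 \left[\frac{v^3}{3} - uv^2 + u^2 v\right]_0^2 du = \frac{1}{64}\int_0^4 \left(\frac{8}{3} - 4u + 2u^2\right) du = \frac{1}{64}\left[\frac{8u}{3} - 2u^2 + \frac{2u^3}{3}\right]_0^4 = \frac{1}{64}\left(\frac{32}{3} - 32 + \frac{128}{3}\right) = \frac{1}{3}.$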
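A numerical cross-check of both change-of-variables computations can be sketched with a midpoint rule in the new variables. This is only an illustration; the function names and grid sizes below are assumptions made for the example.

```python
import math

def x_squared_over_ellipse(n=200):
    """Integral of x^2 over x^2/4 + y^2/9 <= 1 via x = 2u, y = 3v
    (Jacobian 6) followed by polar coordinates on the unit disk."""
    dr = 1.0 / n
    dth = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            th = (j + 0.5) * dth
            u = r * math.cos(th)
            total += 6 * (2 * u) ** 2 * r * dr * dth   # extra r from polar coordinates
    return total

def x_squared_over_parallelogram(n=200):
    """Integral of x^2 over 0 <= y - x <= 4, 0 <= 3x + y <= 2 via
    u = y - x, v = 3x + y, so x = (v - u)/4 and |Jacobian| = 1/4."""
    du = 4.0 / n
    dv = 2.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        for j in range(n):
            v = (j + 0.5) * dv
            x = (v - u) / 4
            total += x * x * 0.25 * du * dv
    return total

if __name__ == "__main__":
    print(x_squared_over_ellipse(), 6 * math.pi)
    print(x_squared_over_parallelogram(), 1 / 3)
```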


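Solution to Exercise 12.8. Let $E$ be a Jordan region in $\mathbb{R}^n$. Prove or disprove: $E^\circ$ is also a Jordan region.


Proof. Let $x \in \partial(E^\circ)$ and let $\epsilon > 0$. Then $B_\epsilon(x)$ contains a point $y$ of $E^\circ$ and a point $z \notin E^\circ$. Since $y \in E^\circ \subseteq E$, the ball $B_\epsilon(x)$ contains a point of $E$. We can then find $\delta > 0$ so that $B_\delta(z) \subseteq B_\epsilon(x)$ since $B_\epsilon(x)$ is open. Since $z \notin E^\circ$ there is a point $w$ of $B_\delta(z)$ which is not contained in $E$, which means that $B_\epsilon(x)$ contains a point which is not contained in $E$. Thus, $x \in \partial(E)$ so $\partial(E^\circ) \subseteq \partial(E)$. Since $\mathrm{Vol}(\partial(E)) = 0$, $\mathrm{Vol}(\partial(E^\circ)) = 0$, which means that $E^\circ$ is a Jordan region.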




Solution to Exercise 12.9. Prove that any bounded subset $E$ of $\mathbb{R}^n$ which has only finitely many limit points has volume zero.


Proof. Let $F$ be the set of limit points of $E$. Let $\epsilon > 0$. Since $F$ is finite we know that $F$ has volume zero, and that we can find cubes $C_1, C_2, \ldots, C_m$ whose interiors cover $F$ so that $\sum_{i=1}^{m} |C_i| < \frac{\epsilon}{2}$ by Theorem 12.27. Then $W = E \setminus \bigcup_{i=1}^{m} C_i$ has no limit points because any limit point of $W$ would be a limit point of $E$ and therefore a point of $F$, but since every point of $F$ is contained in $C_i^\circ$ for some $i$ we know that no point of $F$ is a limit point of $W$. By the Bolzano-Weierstrass theorem, any bounded infinite set must have a limit point, which means that $W$ is finite. Thus, we can find a collection of cubes $K_1, K_2, \ldots, K_t$ covering $W$ so that $\sum_{i=1}^{t} |K_i| < \frac{\epsilon}{2}$. Hence, $\{C_i\}_{i=1}^{m} \cup \{K_i\}_{i=1}^{t}$ is a cover of $E$ by cubes with the sum of the volumes of the cubes in the cover less than $\epsilon$. So, $\mathrm{Vol}(E) = 0$.


Solution to Exercise 12.10. Prove that every open ball (and every closed ball) in $\mathbb{R}^n$ is a Jordan region.


Proof. We begin by proving inductively that any ball about the origin in $\mathbb{R}^n$ is a Jordan region. First, the boundaries of open and closed balls are the same, so we need only verify that a closed ball is a Jordan region, from which it will follow that an open ball is a Jordan region.

If $n = 1$ the boundary of $[-r, r]$ is $\{-r, r\}$, which is a set containing two points and has measure zero, so $[-r, r]$ is a Jordan region. Assume that $B_r(0) = D$ is a Jordan region in $\mathbb{R}^k$. Then the boundary of $B_r(0)$ in $\mathbb{R}^{k+1}$ is the union of the graphs $G_1^{k+1} = \{(x, \sqrt{r^2 - |x|^2}) \mid x \in D\}$ and $G_2^{k+1} = \{(x, -\sqrt{r^2 - |x|^2}) \mid x \in D\}$, both of which are graphs of continuous functions over Jordan regions, so the boundary of $B_r(0)$ has volume zero and $B_r(0)$ is a Jordan region by Theorem 12.50.

Since we have proven that the translation of a Jordan region is a Jordan region, and every closed ball is a translation of a ball centered at the origin, all balls (open and closed) are Jordan regions in $\mathbb{R}^n$ for each $n \in \mathbb{N}$ by induction.


Solution to Exercise 12.11. Show that the countable union of Jordan regions need not be a Jordan region, even if the union is bounded.


Proof. A single point is a Jordan region, so the set $S = \mathbb{Q}^n \cap B_1(0)$ is a countable union of Jordan regions. However, $S$ is not a Jordan region, since $\partial(S) = \overline{B_1(0)}$, which does not have outer volume zero.




Solution to Exercise 12.12. Let $E$ be a Jordan region in $\mathbb{R}^n$. Define the dilation of $E$ by a factor of $k > 0$ to be $kE = \{kx \in \mathbb{R}^n \mid x \in E\}$. Prove that $kE$ is a Jordan region and $\mathrm{Vol}(kE) = k^n \mathrm{Vol}(E)$.


Proof. This follows immediately from Theorem 12.65, since $kE$ is just $AE$, where $A$ is the diagonal matrix whose diagonal entries are all $k$, and $\det(A) = k^n$.


Solution to Exercise 12.13. Prove the Mean Value Theorem for Multiple Integrals. Let $f, g : E \to \mathbb{R}$ be integrable with $g(x) \ge 0$ on the Jordan region $E$ in $\mathbb{R}^n$. Then there is a number $c \in [\inf f(E), \sup f(E)]$ so that $c\int_E g = \int_E fg$. If $f$ is continuous and $E$ is connected then there is a value $w \in E$ so that $f(w) = c$. In particular, there is a number $c \in [\inf f(E), \sup f(E)]$ so that $c\,\mathrm{Vol}(E) = \int_E f$.


Proof. Since $g(x) \ge 0$ we know that $\inf f(E)\,g(x) \le f(x)g(x) \le \sup f(E)\,g(x)$. Thus, $\inf f(E)\int_E g \le \int_E fg \le \sup f(E)\int_E g$ by Theorem 6.4. If $\int_E g = 0$ then $\int_E fg = 0$, so for any number $c$ it is true that $c\int_E g = \int_E fg = 0$. Otherwise, $\inf f(E) \le \frac{\int_E fg}{\int_E g} \le \sup f(E)$. Thus, if $c = \frac{\int_E fg}{\int_E g}$ then $c\int_E g = \int_E fg$. If $E$ is connected and $f$ is continuous then by the Intermediate Value Theorem for $\mathbb{R}^n$ there is some $w \in E$ so that $f(w) = c$. In the case where $g(x) = 1$ we know that $\int_E g = \mathrm{Vol}(E)$, so $c\,\mathrm{Vol}(E) = \int_E f$.


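Solution to Exercise 12.14. Let $f : E \to \mathbb{R}$ be integrable on the Jordan region $E$ in $\mathbb{R}^n$, where $f$ is continuous at the point $z \in E^\circ$. Prove that $\lim_{r \to 0^+} \frac{1}{\mathrm{Vol}(B_r(z))} \int_{B_r(z)} f = f(z)$.


Proof. Let $\epsilon > 0$. Since $z \in E^\circ$ and $f$ is continuous at $z$, we can choose $\delta > 0$ so that $B_\delta(z) \subseteq E$ and so that if $|z - x| < \delta$ then $|f(x) - f(z)| < \epsilon$.

Then $f(z) - \epsilon < f(x) < f(z) + \epsilon$ for all $x \in B_\delta(z)$, so if $0 < r < \delta$ then $(f(z) - \epsilon)\mathrm{Vol}(B_r(z)) \le \int_{B_r(z)} f \le (f(z) + \epsilon)\mathrm{Vol}(B_r(z))$ and therefore $f(z) - \epsilon \le \frac{1}{\mathrm{Vol}(B_r(z))} \int_{B_r(z)} f \le f(z) + \epsilon$. Hence, $\lim_{r \to 0^+} \frac{1}{\mathrm{Vol}(B_r(z))} \int_{B_r(z)} f = f(z)$.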


Solution to Exercise 12.15. Let $f, g : E \to \mathbb{R}$ be integrable on the Jordan region $E$ in $\mathbb{R}^n$. Show that if $f(x) = g(x)$ for every point $x \in \mathbb{Q}^n \cap E$ then $\int_E f = \int_E g$.


Proof. Choose a sequence of grids $\{G_n\}$ on a rectangle $R$ containing $E$ so that $\{|G_n|\} \to 0$. By Theorem 12.30, if we choose any markings $T_n$ of $G_n$ then $\{S_{T_n}(f, G_n)\} \to \int_E f$ and $\{S_{T_n}(g, G_n)\} \to \int_E g$. However, since $\mathbb{Q}^n$ is dense in every rectangle in each grid $G_n$, we can choose the markings $T_n$ to consist of only points of $\mathbb{Q}^n$. Then $S_{T_n}(f, G_n) = S_{T_n}(g, G_n)$ for every $n \in \mathbb{N}$. Hence, $\int_E g = \int_E f$.


Chapter 13 


Vector Fields, Curves and Surfaces 


In this chapter, we discuss curves, surfaces, vector fields and integrals along curves and 
surfaces. We will begin with a discussion of curves. 


Curves 


Definition 98 


For any $k \in \mathbb{N}$ or for $k = \infty$, if there is an open interval $I$ containing $D$ and a $C^k$ function $r^* : I \to \mathbb{R}^n$ so that $r^*$ restricted to $D$ is $r$, then we say that $r$ is $C^k$. If $r$ is $C^1$ and $r'(t) \ne 0$ for each $t \in D$, and $r$ is one to one on $D^\circ$, then we refer to $r$ as a smooth curve. We also refer to the trace $C$ as a smooth curve if there is any parametrization whose trace is $C$ which is a smooth curve. If $D = [a,b]$ then we refer to $r(a)$ and $r(b)$ as the end points (the initial and terminal or starting and ending points respectively) of $C$. If $r$ is a smooth curve which is one to one then we refer to $r$ as a simple smooth curve. If $r$ is a smooth path with $D = [a,b]$ so that $r(a) = r(b)$ then we will refer to $C$ as a smooth closed curve. If $C$ is a smooth closed curve so that $r'(a) = r'(b)$ then we refer to $C$ as a smooth simple closed curve. A regular curve is a $C^\infty$ curve $r$ so that $r'$ is never zero (but a regular curve need not be one to one). The trace of a curve is an arc if $r$ has domain $[a,b]$ and $r$ is one to one. The trace is a simple closed curve if $r$ has domain $[a,b]$, $r(a) = r(b)$ and the function is otherwise one to one (meaning that if $x \notin \{a,b\}$ and $x \ne y$ then $r(x) \ne r(y)$).


It should be noted that what we have defined to be a curve is usually not all that useful except for determining things like path connectedness. It is known that there are space filling curves, for instance, which are continuous images of closed intervals that are two-dimensional or higher (they fill a square or an $n$-rectangle and do not look like curves at all). However, we sometimes want to talk about such things too, so we are using the term "curve" to be this most general sort of curve, in keeping with the notion of a path in topology. Mostly, we care about smooth curves. Without the $r'(t) \ne 0$ requirement there may be no tangent lines to a curve and the graph may have cusps. Without the one to one requirement, notions of arc length become confusing (they end up referring to distance traveled along a curve, possibly moving back and forth, rather than the length of the curve itself).


Definition 99 


We say a set $C$ in $\mathbb{R}^n$ is an arc if $C = f([a,b])$ for some closed interval $[a,b]$ and some continuous one to one function $f : [a,b] \to \mathbb{R}^n$. A simple closed curve is the one to one continuous image of a circle.


Since an arc and a simple closed curve do not have to be differentiable, they are usually not the objects we are interested in if our goal is to explore calculus notions on an object. However, there are results about arcs or simple closed curves that are relevant to our discussions.


Definition 100 
Let $C$ be the one to one curve $r : [a_0, a_m] \to \mathbb{R}^n$. If there are simple smooth curves $C_1 = r_1([a_0, a_1])$, $C_2 = r_2([a_1, a_2])$, ..., $C_m = r_m([a_{m-1}, a_m])$ so that $r(x) = r_i(x)$ whenever $x \in [a_{i-1}, a_i]$ then we say that $C$ is a piecewise-smooth simple curve and we will refer to the set $\{C_1, \ldots, C_m\}$ as a decomposition for $C$, and each $C_i$ as a component curve of $C$. If we amend the definition of piecewise-smooth simple curve so that $r_1(a_0) = r_m(a_m)$ (and $r$ is otherwise one to one, meaning that if $x \notin \{a_0, a_m\}$ and $y \in [a_0, a_m]$ with $y \ne x$ then $r(x) \ne r(y)$) then we say $C$ is a piecewise-smooth closed curve.


The reader should be aware that there are some differences in the specifics of what some of the terms above mean from one text to another. For example, in some books a smooth curve may only have to be continuously differentiable, or a smooth curve may have to be regular. These definitions are not consistent across texts.


Example 13.1. (a) Explain why the parametrization $r(t) = \langle a + R\cos(t), b + R\sin(t) \rangle$, $0 \le t \le 2\pi$ is a parametrization for a circle of radius $R$ centered at $(a,b)$, traversed once counterclockwise.


(b) What curve would be traversed by $r(t) = \langle a\cos(t), b\sin(t) \rangle$, for $0 \le t \le 2\pi$?
(c) What curve would be traversed by $r(t) = \langle \cos(t), \sin(t), t \rangle$, for $t \in \mathbb{R}$?


Solution. (a) We observe that $(x - a)^2 + (y - b)^2 = R^2\cos^2(t) + R^2\sin^2(t) = R^2$, which we know is the equation for a circle of radius $R$ centered at $(a,b)$. For each point $p$ on the circle, there is a unique angle $\theta \in [0, 2\pi)$ so that a radius $R$ line segment from center $(a,b)$ at angle $\theta$ would end at $p$, which means that $p = \langle a + R\cos(\theta), b + R\sin(\theta) \rangle$. In the interval $[0, 2\pi]$ the only repeated point would be at $0$ and $2\pi$, so the circle is traversed only once. The circle is traversed counterclockwise since larger angle values measured from the positive $x$ direction correspond to points further along the curve in the counterclockwise direction.

(b) In the case of $r(t) = \langle a\cos(t), b\sin(t) \rangle$, we note that $\frac{x^2}{a^2} + \frac{y^2}{b^2} = \cos^2(t) + \sin^2(t) = 1$, so the curve traced out is an ellipse.

(c) Since $x^2 + y^2 = 1$, just as for the unit circle about the origin in (a), the points of the parametrized curve lie on the cylinder with radius one about the $z$-axis. The third component increases at a constant rate, which gives a trace which is a helix twisting upwards.
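The three identities used in this solution can be spot-checked numerically. The sketch below samples each parametrization and measures how far the sampled points are from satisfying the corresponding equation; the function names and the sample values of $a$, $b$, $R$ are assumptions chosen for illustration.

```python
import math

def circle(t, a, b, R):
    return (a + R * math.cos(t), b + R * math.sin(t))

def ellipse(t, a, b):
    return (a * math.cos(t), b * math.sin(t))

def helix(t):
    return (math.cos(t), math.sin(t), t)

def sample_times(start, stop, count=200):
    return [start + (stop - start) * i / (count - 1) for i in range(count)]

def circle_residual(a, b, R):
    """Largest value of |(x-a)^2 + (y-b)^2 - R^2| over the samples."""
    worst = 0.0
    for t in sample_times(0.0, 2 * math.pi):
        x, y = circle(t, a, b, R)
        worst = max(worst, abs((x - a) ** 2 + (y - b) ** 2 - R ** 2))
    return worst

def ellipse_residual(a, b):
    """Largest value of |x^2/a^2 + y^2/b^2 - 1| over the samples."""
    worst = 0.0
    for t in sample_times(0.0, 2 * math.pi):
        x, y = ellipse(t, a, b)
        worst = max(worst, abs(x * x / (a * a) + y * y / (b * b) - 1))
    return worst

def helix_residual():
    """Largest value of |x^2 + y^2 - 1| over the samples (cylinder of radius one)."""
    worst = 0.0
    for t in sample_times(-10.0, 10.0):
        x, y, _ = helix(t)
        worst = max(worst, abs(x * x + y * y - 1))
    return worst

if __name__ == "__main__":
    print(circle_residual(1.0, -2.0, 3.0), ellipse_residual(2.0, 5.0), helix_residual())
```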


It is worth noting that there are multiple parametrizations for the same curve. For instance, the circle in example (a) can be traversed by $r(t) = \langle a + R\cos(2t), b + R\sin(2t) \rangle$, $0 \le t \le \pi$, which is a parametrization tracing out the same curve counterclockwise at twice the speed of the former parametrization. We would like to formalize notions of speed and acceleration of a particle moving along a parametrized path, motivating the following definitions.


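Let $r(t) = \langle x(t), y(t), z(t) \rangle$ be a parametrized curve defined on an interval $I$. We should observe that, for each point $t_0 \in I$, $\lim_{t \to t_0} r(t) = \langle x_0, y_0, z_0 \rangle$ if and only if $\lim_{t \to t_0} x(t) = x_0$, $\lim_{t \to t_0} y(t) = y_0$ and $\lim_{t \to t_0} z(t) = z_0$. Likewise, for a two component parametrized curve $r(t) = \langle x(t), y(t) \rangle$ defined on an interval $I$, for each point $t_0 \in I$ it is the case that $\lim_{t \to t_0} r(t) = \langle x_0, y_0 \rangle$ if and only if $\lim_{t \to t_0} x(t) = x_0$ and $\lim_{t \to t_0} y(t) = y_0$.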


oe ae - 
Eenpleiao maine UO 
t0 t t #2 


Solution. We can just take the limit of each coordinate and use L’Hospital’s Rule. For 
sin(3t) i 3.cos(3t) 


the first coordinate we get lim ———~ = lim ———~— = 8. In the second coordinate, 

: /-_ t0 t t0 ik f 

=e, = De ~sin(t 

lim ae lim = 0 and in the third coordinate lim eon) = lim ny = 
t0 t t0 +0 t? t+0 «= 2t 

= t 1 1 
lim Sant) = ——. Thus, the limit is < 3,0,-—= >. 
t30 2 2 2 


Example 13.3. Find the derivative of $r(t) = \langle t, e^t, t^3 + 1 \rangle$ at $(0, 1, 1)$.


Solution. Setting the first coordinates equal to each other we see $t = 0$. $r'(t) = \langle 1, e^t, 3t^2 \rangle$. Setting $t = 0$ we see $r'(0) = \langle 1, 1, 0 \rangle$.


It is often helpful to know the length of a curve. This can help us to determine how far an object moves when it traverses a curve over a given time interval.




Definition 101 


Let $r(t)$ on $a \le t \le b$ be a regular curve. Let $\{P_n\}$ be the standard sequence of partitions of $[a,b]$ into $n$ evenly spaced subdivisions. We define $L = \lim_{n \to \infty} \sum_{i=1}^{n} |r(t_i) - r(t_{i-1})|$ to be the arc length of the trace of this regular curve. If $C$ has an arc length which is finite then we say that $C$ is rectifiable.


Theorem 13.1. Let $r : [a,b] \to \mathbb{R}^3$ be a smooth curve with trace $C$. Then $C$ is rectifiable with arc length $L = \int_a^b |r'(t)|\,dt$.


In particular:
If $r(t) = \langle x(t), y(t) \rangle$ over $a \le t \le b$ then the arc length of $C$ is given by the formula $L = \int_a^b \sqrt{(x'(t))^2 + (y'(t))^2}\,dt$.


If $r(t) = \langle x(t), y(t), z(t) \rangle$ over $a \le t \le b$ then the arc length of $C$ is given by the formula $L = \int_a^b \sqrt{(x'(t))^2 + (y'(t))^2 + (z'(t))^2}\,dt$.

If $y = f(x)$ over $a \le x \le b$ then $L = \int_a^b \sqrt{1 + (f'(x))^2}\,dx$.


Furthermore, for every $\epsilon > 0$ there is a $\delta > 0$ so that if $P = \{t_0, t_1, \ldots, t_n\}$ is a partition of $[a,b]$ with $|P| < \delta$ then $\left| \int_a^b |r'(t)|\,dt - \sum_{i=1}^{n} |r(t_i) - r(t_{i-1})| \right| < \epsilon$.


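Proof. We will prove the theorem for a curve $r(t) = \langle x(t), y(t), z(t) \rangle$, from which the result will follow for a two coordinate curve (by setting $z(t) = 0$). First, note that $L = \int_a^b |r'(t)|\,dt$ exists since $|r'(t)|$ is continuous.

Let $P_n = \{t_0, t_1, \ldots, t_n\}$ be the standard $n$th partition of $[a,b]$ into equal length subintervals with each $t_i - t_{i-1} = \Delta t$. In each subinterval $[t_{i-1}, t_i]$, by the Mean Value Theorem, we can pick $t_i^{(x)}, t_i^{(y)}, t_i^{(z)}$ so that $x'(t_i^{(x)})\Delta t = x(t_i) - x(t_{i-1})$, $y'(t_i^{(y)})\Delta t = y(t_i) - y(t_{i-1})$, $z'(t_i^{(z)})\Delta t = z(t_i) - z(t_{i-1})$. The distance from $r(t_{i-1})$ to $r(t_i)$ is

$\sqrt{(x(t_i) - x(t_{i-1}))^2 + (y(t_i) - y(t_{i-1}))^2 + (z(t_i) - z(t_{i-1}))^2}$
$= \sqrt{(x'(t_i^{(x)}))^2(\Delta t)^2 + (y'(t_i^{(y)}))^2(\Delta t)^2 + (z'(t_i^{(z)}))^2(\Delta t)^2}$
$= \sqrt{(x'(t_i^{(x)}))^2 + (y'(t_i^{(y)}))^2 + (z'(t_i^{(z)}))^2}\,\Delta t.$

Let $G(u,v,w) = \sqrt{(x'(u))^2 + (y'(v))^2 + (z'(w))^2}$. Note that $G$ is continuous since each of $x', y', z'$ is continuous. Since $G$ is continuous on the closed and bounded set $[a,b] \times [a,b] \times [a,b]$ we know that $G$ is uniformly continuous by Theorem 10.35. Let $\epsilon > 0$. Choose a $\delta > 0$ so that if $|x - y| < \delta$ then $|G(x) - G(y)| < \frac{\epsilon}{b - a}$. If $n$ is large enough that $\Delta t < \frac{\delta}{\sqrt{3}}$ it follows that each $|(t_i, t_i, t_i) - (t_i^{(x)}, t_i^{(y)}, t_i^{(z)})| < \delta$, which means that

$\left| \sum_{i=1}^{n} \sqrt{(x'(t_i))^2 + (y'(t_i))^2 + (z'(t_i))^2}\,\Delta t - \sum_{i=1}^{n} \sqrt{(x'(t_i^{(x)}))^2 + (y'(t_i^{(y)}))^2 + (z'(t_i^{(z)}))^2}\,\Delta t \right|$
$= \left| \sum_{i=1}^{n} \left(G(t_i, t_i, t_i) - G(t_i^{(x)}, t_i^{(y)}, t_i^{(z)})\right)\Delta t \right| \le \sum_{i=1}^{n} \left|G(t_i, t_i, t_i) - G(t_i^{(x)}, t_i^{(y)}, t_i^{(z)})\right|\Delta t < \sum_{i=1}^{n} \frac{\epsilon}{b-a}\,\Delta t = \epsilon.$

We know that $\lim_{n \to \infty} \sum_{i=1}^{n} \sqrt{(x'(t_i))^2 + (y'(t_i))^2 + (z'(t_i))^2}\,\Delta t = \int_a^b |r'(t)|\,dt = L$ and $\lim_{n \to \infty} \left( \sum_{i=1}^{n} \sqrt{(x'(t_i))^2 + (y'(t_i))^2 + (z'(t_i))^2}\,\Delta t - \sum_{i=1}^{n} \sqrt{(x'(t_i^{(x)}))^2 + (y'(t_i^{(y)}))^2 + (z'(t_i^{(z)}))^2}\,\Delta t \right) = 0$. Hence, $\lim_{n \to \infty} \sum_{i=1}^{n} |r(t_i) - r(t_{i-1})| = \lim_{n \to \infty} \sum_{i=1}^{n} \sqrt{(x'(t_i^{(x)}))^2 + (y'(t_i^{(y)}))^2 + (z'(t_i^{(z)}))^2}\,\Delta t = L$. Thus, $C$ is rectifiable with arc length $L = \int_a^b |r'(t)|\,dt$.


In particular if we parametrize $y = f(x)$ on the interval $[a,b]$ with the parametrization $r(t) = \langle t, f(t) \rangle$ for $a \le t \le b$ then we have $L = \int_a^b \sqrt{1 + (f'(t))^2}\,dt = \int_a^b \sqrt{1 + (f'(x))^2}\,dx$.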


To verify the last part of the theorem, we note that none of the steps in the argument 
above depended on the partition points being equally spaced, only on the mesh of the 
partition being sufficiently small. 
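The convergence of polygonal lengths to the arc length integral can be observed numerically. The sketch below uses the helix $\langle \cos t, \sin t, t \rangle$ on $[0, 2\pi]$, for which $|r'(t)| = \sqrt{2}$ and so $L = 2\pi\sqrt{2}$; the helper name `polygonal_length` is an assumption made for this illustration.

```python
import math

def polygonal_length(r, a, b, n):
    """Sum of |r(t_i) - r(t_{i-1})| over the partition of [a, b]
    into n evenly spaced subintervals."""
    points = [r(a + (b - a) * i / n) for i in range(n + 1)]
    return sum(math.dist(points[i - 1], points[i]) for i in range(1, n + 1))

def helix(t):
    return (math.cos(t), math.sin(t), t)

if __name__ == "__main__":
    exact = 2 * math.pi * math.sqrt(2)   # integral of |r'(t)| = sqrt(2) over [0, 2*pi]
    for n in (4, 16, 64, 256, 1024):
        print(n, polygonal_length(helix, 0.0, 2 * math.pi, n), exact)
```

The printed sums increase toward the integral as the mesh shrinks, in line with the last part of the theorem.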


We know that $s(t) = \int_a^t |r'(u)|\,du$ exists for any smooth curve $r$, which is the arc length of the curve $C$ parametrized by $r(t)$ measured from the point $r(a)$. Since $|r'(t)|$ is positive and continuous, we know that $s(t)$ is an increasing function of $t$. Thus, $s(t)$ is invertible, with inverse $t(s)$. We say that the function $r(t(s))$ or just $r(s)$ (where it is understood that $r(s)$ refers to $r \circ t(s)$) is a parametrization of $C$ with respect to arc length. If we fix the value of $a$ then this parametrization is unique.
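As an illustration of inverting the arc length function, the sketch below estimates $s(t)$ with a midpoint rule, inverts it by bisection (which works because $s$ is increasing), and compares with the helix, where $s(t) = \sqrt{2}\,t$ and so $t(s) = s/\sqrt{2}$. All function names, step sizes and tolerances are assumptions for the example, and the speed is approximated by a central difference rather than computed exactly.

```python
import math

def speed(r, t, h=1e-5):
    """Central-difference estimate of |r'(t)|."""
    return math.dist(r(t - h), r(t + h)) / (2 * h)

def arc_length(r, a, t, n=2000):
    """Midpoint-rule estimate of s(t), the integral of |r'(u)| from a to t."""
    du = (t - a) / n
    return sum(speed(r, a + (i + 0.5) * du) for i in range(n)) * du

def t_of_s(r, a, b, s, tol=1e-9):
    """Invert s(t) on [a, b] by bisection, using that s is increasing."""
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if arc_length(r, a, mid) < s:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def helix(t):
    return (math.cos(t), math.sin(t), t)

if __name__ == "__main__":
    print(t_of_s(helix, 0.0, 2 * math.pi, 3.0), 3.0 / math.sqrt(2))
```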


Definition 102 


Let $r(t)$ be a parametrized curve defined on an open interval $I$. The velocity is $\mathbf{v}(t) = r'(t)$ for this curve, or the velocity of an object whose position is $r(t)$, and the acceleration of an object whose position is $r(t)$ is $r''(t) = \mathbf{a}(t)$ (assuming these derivatives exist). We say that $|\mathbf{v}(t)| = v(t)$ is the speed of a particle whose position at time $t$ is $r(t)$. Thus, $v(t) = |r'(t)|$.

The integral of the vector valued function $r(t) = \langle x_1(t), x_2(t), \ldots, x_n(t) \rangle$ over $[a,b]$ is $\left\langle \int_a^b x_1(t)\,dt, \int_a^b x_2(t)\,dt, \ldots, \int_a^b x_n(t)\,dt \right\rangle$.

If the curve $C$ parametrized by $r(t)$ is smooth then $s(t) = \int_a^t |r'(u)|\,du$ is the arc length function measured from the point $r(a) = p$. If $t(s)$ is the inverse of $s(t)$ then $r(s)$ denotes $r(t(s))$, which is the parametrization of $C$ with respect to arc length measured from the point $p$.


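Example 13.4. Find the integral of $r(t) = \langle e^t, 2t, 4 \rangle$ over $[0, 2]$.


Solution. The integral is $\left\langle \int_0^2 e^t\,dt, \int_0^2 2t\,dt, \int_0^2 4\,dt \right\rangle = \langle e^t, t^2, 4t \rangle \Big|_0^2 = \langle e^2 - 1, 4, 8 \rangle$.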
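Since the integral of a vector valued function is taken one component at a time, it can be approximated one component at a time as well. The following sketch applies a midpoint rule to each component of $\langle e^t, 2t, 4 \rangle$ on $[0,2]$; the helper name `integrate_vector` and the number of subintervals are assumptions for illustration.

```python
import math

def integrate_vector(r, a, b, n=10000):
    """Componentwise midpoint-rule estimate of the integral of r over [a, b]."""
    dt = (b - a) / n
    totals = [0.0] * len(r(a))
    for i in range(n):
        value = r(a + (i + 0.5) * dt)
        for k, component in enumerate(value):
            totals[k] += component * dt
    return tuple(totals)

def r(t):
    return (math.exp(t), 2 * t, 4.0)

if __name__ == "__main__":
    print(integrate_vector(r, 0.0, 2.0))
    print((math.exp(2) - 1, 4.0, 8.0))
```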


Using parametric equations it is possible to find formulas for area under a curve and for 
the rate of change of one variable with respect to another. 


Theorem 13.2. Let $r(t) = \langle x(t), y(t) \rangle$ be a $C^1$ parametrized curve defined on an open interval $I$. If $x'(t_0) \ne 0$ for some $t_0 \in I$ then for some $\delta > 0$, if we restrict $r(t)$ to the domain $(t_0 - \delta, t_0 + \delta)$ then $y$ is a differentiable function of $x$ on that curve and $\frac{dy}{dx}$ exists at $x(t_0)$ and is equal to $\frac{y'(t_0)}{x'(t_0)}$. Furthermore, the second derivative $\frac{d^2y}{dx^2} = \frac{\left(\frac{dy}{dx}\right)'(t_0)}{x'(t_0)}$, where the prime denotes the derivative with respect to $t$.


Proof. First, we note that we can choose a $\delta > 0$ so that if $|t - t_0| < \delta$ then $|x'(t) - x'(t_0)| < \frac{1}{2}|x'(t_0)|$, which means that either $x'(t) > \frac{1}{2}x'(t_0) > 0$ or $x'(t) < \frac{1}{2}x'(t_0) < 0$ on $(t_0 - \delta, t_0 + \delta)$. Thus, $x(t)$ is strictly monotone on $(t_0 - \delta, t_0 + \delta)$. Hence, for every $t \in (t_0 - \delta, t_0 + \delta)$, it follows that there is only one $y$ value $y(t)$ so that $(x(t), y(t)) \in r((t_0 - \delta, t_0 + \delta))$. This means that $y$ is a function of $x$. Furthermore, since $x(t)$ is continuously differentiable, $x(t)$ has a continuously differentiable inverse function $t(x)$ so that $t'(x) = \frac{1}{x'(t)}$ by the Inverse Function Theorem (where $t$ is the point that maps to $x$ under $x(t)$). Using the chain rule and the Inverse Function Theorem we see that $\frac{dy}{dx}$ is the derivative of $y(t(x))$ with respect to $x$, which is $y'(t(x))t'(x) = \frac{y'(t)}{x'(t)}$.

To prove the last part of the theorem, we note that since $\frac{dy}{dx}$ is $\frac{y'(t)}{x'(t)}$ on $(t_0 - \delta, t_0 + \delta)$, by the first part of the argument (substituting $\frac{dy}{dx}$ for $y(t)$), it follows that the rate of change of $\frac{dy}{dx}$ with respect to $x$ is $\frac{\left(\frac{dy}{dx}\right)'(t_0)}{x'(t_0)}$ at $x(t_0)$.


It is important to notice in the preceding discussion that the value of $\frac{dy}{dx}$ obtained only exists locally at $t_0$, meaning that it is only for the function restricted to the $(t_0 - \delta, t_0 + \delta)$ interval (or any interval on which $x(t)$ is strictly monotone which contains $t_0$). If there is only one $t$ value at which $x(t) = x(t_0)$ or if $\frac{dy}{dx}$ is the same for all $t$ values so that $x(t) = x(t_0)$ then there is such a thing as the tangent line slope at $x(t_0)$ for $r(t)$, and $\frac{dy}{dx}(x(t_0))$ is the tangent line slope. On the other hand, if the graph of $r(t)$ passes through the same point at two different times $t_1, t_2$ and the slopes $\frac{dy}{dx}$ exist locally at both values of $t$ but $\frac{dy}{dx}(x(t_1)) \ne \frac{dy}{dx}(x(t_2))$ (the values of $\frac{dy}{dx}$ for portions of the curve at $t$ values near $t_1$ and $t_2$ respectively) then the graph of $r(t)$ does not have a tangent line at $(x(t_1), y(t_1))$. In some texts, it is said that the graph has two tangent lines at the point but we will use the convention that seems more common in later advanced calculus courses, and say that the curve restricted to two subintervals of its domain has two different tangent lines but the entire graph has none. Another way of saying this is:


Definition 103 


If the parametrized curve $r : (a,b) \to \mathbb{R}^2$ has a trace which, if intersected with some $B_\epsilon(r(t_0))$, is also the graph of a function $y = f_1(x)$ or $x = f_2(y)$ for differentiable functions $f_1$ or $f_2$ then the trace of the curve is locally a differentiable function graph near $r(t_0)$, and a tangent line to the trace of $r((a,b))$ exists. In the case where $x'(t_0) = 0$ the tangent line is vertical (but still exists even though $\frac{dy}{dx}$ does not).

By simply switching variable labels, we can likewise find $\frac{dx}{dy}(t_0) = \frac{x'(t_0)}{y'(t_0)}$ for the parametrization restricted to an interval of $t$ values about $t_0$ on which $y(t)$ is monotone, so we can identify vertical tangent lines when $x'(t) = 0$, for instance, just as horizontal tangent lines occur when $y'(t) = 0$.


Example 13.5. Let $r(t) = \langle t^3 + 2t, e^{2t} \rangle$. Find the tangent line to this curve, and also find $\frac{d^2y}{dx^2}$, at $(0, 1)$.


Solution. This point occurs when $t = 0$. Note that $x'(0) = 3(0)^2 + 2 = 2 > 0$, so near $t = 0$ it is true that $y$ is a function of $x$. In fact, $x'(t) > 0$ for all $t$, which means that there is no other value $t$ so that $x(t) = 0$, so there is a tangent line to the curve (whether the domain is restricted or not) at each point. The derivative is $\frac{dy}{dx} = \frac{2e^{2t}}{3t^2 + 2} = 1$ when $t = 0$. The second derivative is $\frac{d^2y}{dx^2} = \frac{1}{3t^2 + 2} \cdot \frac{(3t^2 + 2)(4e^{2t}) - 2e^{2t}(6t)}{(3t^2 + 2)^2}$. At $t = 0$ this is also equal to 1.

Since $\frac{dy}{dx}$ is the slope of the tangent line, we can use the usual point-slope form of the line to get the tangent line, which is $y - 1 = (1)(x - 0)$.
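The values of $\frac{dy}{dx}$ and $\frac{d^2y}{dx^2}$ at $t = 0$ can be spot-checked with finite differences, using the formulas of Theorem 13.2. This is a rough numerical sketch; the function names and step sizes are assumptions, and nested central differences lose some accuracy.

```python
import math

def x_of_t(t):
    return t ** 3 + 2 * t

def y_of_t(t):
    return math.exp(2 * t)

def derivative(f, t, h=1e-5):
    """Central-difference estimate of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)

def dy_dx(t):
    """dy/dx = y'(t) / x'(t)."""
    return derivative(y_of_t, t) / derivative(x_of_t, t)

def d2y_dx2(t):
    """d^2y/dx^2 = (d/dt of dy/dx) / x'(t); a larger step is used for the outer difference."""
    return derivative(dy_dx, t, h=1e-4) / derivative(x_of_t, t)

if __name__ == "__main__":
    print(dy_dx(0.0), d2y_dx2(0.0))
```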


We can also use parametrizations to find the area between a parametrized curve and an 
axis, or enclosed by a parametrized simple closed curve. We discuss the latter more when 
we talk about Green’s Theorem. 




Theorem 13.3. Let $r(t) = \langle x(t), y(t) \rangle$ be a $C^1$ parametrized curve defined on an open interval $I$ so that on a closed interval $[a,b] \subset I$ it is true that $x'(t) > 0$ on the interior of $[a,b]$. Then $\int_{x(a)}^{x(b)} y(t(x))\,dx = \int_a^b y(t)x'(t)\,dt$. If $x'(t) < 0$ on the interior of $[a,b]$ then $\int_{x(b)}^{x(a)} y(t(x))\,dx = -\int_a^b y(t)x'(t)\,dt$.


Proof. First, note that since $y(t)x'(t)$ is continuous, it is integrable. As in the preceding theorem, we know that $y$ is a function of $x$ on $[a,b]$ since $x(t)$ is strictly monotone, so if $t(x)$ is the inverse function of $x(t)$ then $y(t(x))$ is a function of $x$. Using the substitution $x = x(t)$ we have that $dx = x'(t)\,dt$. Hence, $\int_a^b y(t)x'(t)\,dt = \int_a^b y(t(x(t)))x'(t)\,dt = \int_{x(a)}^{x(b)} y(t(x))\,dx$.

In the case of $x'(t) < 0$ we would have bounds where $x(a) > x(b)$, and switching them negates the integral.


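Example 13.6. Find the area enclosed by the ellipse $r(t) = \langle a\cos(t), b\sin(t) \rangle$, $0 \le t \le 2\pi$, where $a, b > 0$.


Solution. Note that $x'(t) = -a\sin(t) < 0$ on the interior of the interval $\left[0, \frac{\pi}{2}\right]$, and the portion of the ellipse in the first quadrant is traced out by the parametrization restricted to this subinterval. By symmetry the enclosed area is four times the area in the first quadrant, so it follows that the enclosed area is $-4\int_0^{\pi/2} b\sin(t)(-a\sin(t))\,dt$. Using Wallis's formula we obtain $4\int_0^{\pi/2} ab\sin^2(t)\,dt = 4ab \cdot \frac{1}{2} \cdot \frac{\pi}{2} = \pi ab$.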
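The area formula can be checked numerically by applying a midpoint rule to $-4\int_0^{\pi/2} y(t)x'(t)\,dt$. The function name `ellipse_area` and the sample semi-axes are assumptions chosen for this sketch.

```python
import math

def ellipse_area(a, b, n=10000):
    """Midpoint-rule estimate of -4 * integral over [0, pi/2] of y(t) x'(t) dt
    for x(t) = a cos t, y(t) = b sin t."""
    dt = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        y = b * math.sin(t)
        x_prime = -a * math.sin(t)      # x'(t) < 0 on the open interval (0, pi/2)
        total += y * x_prime * dt
    return -4 * total

if __name__ == "__main__":
    print(ellipse_area(3.0, 2.0), math.pi * 3.0 * 2.0)
```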
0 


Definition 104 


Let $r(t)$ be a smooth curve. The unit tangent vector is $T(t) = \frac{r'(t)}{|r'(t)|}$. This is a unit vector in the direction of the curve.

The unit normal vector is $N(t) = \frac{T'(t)}{|T'(t)|}$. This vector is perpendicular to the direction of a curve, and the curve accelerates into the normal direction as well as accelerating along the direction of the tangent vector. All of the acceleration of $r(t)$ is a sum of its acceleration into the tangent direction and its acceleration in the direction of the unit normal vector. The fact that $T(t)$ and $N(t)$ are perpendicular just follows from taking $T \cdot T = 1$ and differentiating both sides to get $2T' \cdot T = 0$, so $T$ is perpendicular to $T'$.

The unit binormal vector is $B(t) = T(t) \times N(t)$.

The curvature $\kappa$ of $r(t)$ is $\left| \frac{dT}{ds} \right|$.

Let $\alpha(s)$ be a regular curve parametrized with respect to arc length. The torsion $\tau(s)$ is the number so that $B'(s) = \tau(s)N(s)$.

The normal plane to a curve $r(t)$ at time $t = t_0$ is the plane perpendicular to $r'(t_0)$ which passes through the point $r(t_0)$. The osculating plane is the plane which is perpendicular to the binormal vector at time $t = t_0$ and passes through the point $r(t_0)$. The rectifying plane is the plane through the same point which is perpendicular to the unit normal vector.


Initially, it might seem like this definition of curvature could depend on which parametrization we use, but any parametrization with respect to arc length gives the same curvature definition. In fact, we will show this definition of curvature is the same as $\frac{|T'(t)|}{|r'(t)|} = \frac{|r'(t) \times r''(t)|}{|r'(t)|^3}$. Typically, we will want to use the $\frac{|r'(t) \times r''(t)|}{|r'(t)|^3}$ formula to find curvature.

Curvature is a measure of how sharply a space curve turns. A radius $R$ circle has curvature $\frac{1}{R}$, so if the curvature of a space curve is $\kappa$ at a given time, then a circle of radius $\frac{1}{\kappa}$ would bend as sharply as the given curve at that point along the curve.

Note that some books define the torsion to be the negative of the torsion we gave.
That there even is such a number requires a theorem, and is shown below.

Note that the unit tangent, normal, and binormal vectors, as well as the curvature, are 
local properties of a parametrized curve. Thus, if a curve intersects itself later then at a 
given point on the trace of the curve there could be two unit tangent vectors, one at the 
time value the first time the curve passed through the point, and another unit tangent 
vector at another time. Hence, we talk about the unit tangent, normal, binormal vectors 
and curvature at a value t of the parameter rather than at a point on the trace of the curve 
itself. 


Theorem 13.4. Let $r : I \to \mathbb{R}^3$ be a smooth curve, where $t_0 \in I$. Let $u(s) : J \to \mathbb{R}^3$ be
a parametrization of r with respect to arc length, where s is the arc length measured from
some point $r(t_0)$. Then u(s) is a regular curve and $u'(s)$ is the unit tangent vector to u(s)
at $s = s_0$.


Proof. Since $s(t) = \int_{t_0}^{t} |r'(t)|\,dt$, by the Fundamental Theorem of Calculus (first form) we
note that $s'(t) = |r'(t)| > 0$, which means that s is continuously differentiable and increasing,
and thus one to one. It follows from the Inverse Function Theorem that t(s), the inverse
function, is also increasing and continuously differentiable. Since $u(s) = r(t(s))$, which is
a composition of two continuously differentiable functions, it follows from the chain rule
that $u'(s) = r'(t(s))t'(s)$ is continuous. Also, $|u'(s)| = |t'(s)||r'(t(s))| > 0$, so u is a
regular curve. Finally, by the Inverse Function Theorem we know that $t'(s) = \frac{1}{s'(t(s))}$.
Thus, $|u'(s)| = |t'(s)||r'(t(s))| = \frac{1}{s'(t(s))}|r'(t(s))| = \frac{1}{|r'(t(s))|}|r'(t(s))| = 1$.
This means that $T(s_0) = \frac{u'(s_0)}{|u'(s_0)|} = u'(s_0)$.


Theorem 13.5. Let $r : I \to \mathbb{R}^3$ be a smooth curve, where $t_1 \in I$. Let $u(s) : J \to \mathbb{R}^3$
be a parametrization of r with respect to arc length with $s(t) = \int_{t_0}^{t} |r'(t)|\,dt$, and $t_0 \in I$.
Then the curvature $\kappa$ at that point (meaning $\kappa(t_1)$ or $\kappa(s(t_1))$ depending on whether we use
parametrization r or u) is

$$\left|\frac{dT}{ds}(s(t_1))\right| = \frac{|T'(t_1)|}{|r'(t_1)|} = \frac{|r'(t_1) \times r''(t_1)|}{|r'(t_1)|^3} = |u''(s_1)|,$$

where $s_1 = s(t_1)$.


Proof. First, $\left|\frac{dT}{ds}(s(t_1))\right|$ is the definition of $\kappa(t_1)$ and also the definition of $\kappa(s_1)$ with respect
to the parametrizations r and u respectively. Parametrizing with respect to t we see from
Theorem 13.2 that

$$\left|\frac{dT}{ds}(s(t_1))\right| = \frac{\left|\frac{dT}{dt}(t_1)\right|}{\left|\frac{ds}{dt}(t_1)\right|} = \frac{|T'(t_1)|}{|r'(t_1)|}.$$

Since $u'(s_1) = T(s_1)$ we have that $|u''(s_1)| = \left|\frac{dT}{ds}(s_1)\right| = \kappa$. Note that it follows from this
that the curvature is independent of parametrization choice.

Let $v(t) = |r'(t)|$. Then at any time t it is the case that $r' = Tv$, which means that
$r'' = T'v + Tv'$. Recall that T and T' are perpendicular (explained after the definition of
unit normal vector earlier). Taking the cross product, and using the fact that if two vectors
are parallel their cross product is the zero vector and if two vectors are perpendicular then
the norm of their cross product is the product of their norms, we see that
$r' \times r'' = v^2(T \times T')$, and since $|T'| = \kappa v$, it follows that
$|r' \times r''| = v^2|T'||T| = v^3\kappa$, so $\kappa = \frac{|r'(t) \times r''(t)|}{|r'(t)|^3}$.


The following relationships between unit tangent, unit normal and unit binormal vectors 
tell us quite a bit about the behavior of curves and are useful in differential geometry 
developments. 


Theorem 13.6. Frenet formulas: Let u(s) be a regular curve parametrized with respect to
arc length. Then:

(a) $T'(s) = \kappa(s)N(s)$

(b) There is a number $\tau(s)$ (the torsion) so that $B'(s) = \tau(s)N(s)$

(c) $N'(s) = -\kappa(s)T(s) - \tau(s)B(s)$

Proof. (a) $\kappa(s) = |u''(s)| = |T'(s)|$ and $N(s) = \frac{T'(s)}{|T'(s)|}$, so the result follows.

(b) We know B(s) is perpendicular to N(s), T(s) (since the cross product of two vectors
is perpendicular to each vector, and $B(s) = T(s) \times N(s)$). Likewise, since B(s) is a unit
vector we have $B(s) \cdot B(s) = 1$. Differentiating both sides we see that $2B'(s) \cdot B(s) = 0$, so
B'(s) is perpendicular to B(s). Thus, B'(s) is in the osculating plane. Using the derivative
formula for cross product we have $B'(s) = (T(s) \times N(s))' = T'(s) \times N(s) + T(s) \times N'(s)$.
Recall from (a) that T'(s) is parallel to N(s), so $T'(s) \times N(s) = 0$ and
$B'(s) = T(s) \times N'(s)$, which is perpendicular to T(s).
Hence, $T(s) \cdot B'(s) = 0$. Thus, B'(s) is perpendicular to both T(s) and B(s) which means
B'(s) is parallel to N(s) and there is a number $\tau$ as described.

(c) By Theorem 10.4 part (g) we see that $N(s) = B(s) \times T(s)$ and $B(s) \times N(s) = -T(s)$.
Thus, $N'(s) = B'(s) \times T(s) + B(s) \times T'(s)$. Since $B'(s) = \tau(s)N(s)$, we have
$B'(s) \times T(s) = \tau(s)N(s) \times T(s) = -\tau(s)B(s)$,
and since $T'(s) = \kappa(s)N(s)$ it follows that the second vector in this sum is
$B(s) \times T'(s) = \kappa(s)B(s) \times N(s) = -\kappa(s)T(s)$ since $B(s) \times N(s) = -T(s)$.




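Example 13.7. Let $r(t) = \langle 3\cos(t), 3\sin(t), 4t \rangle$.

(a) Find T(t), N(t), B(t).
(b) Find $\kappa(t)$, the curvature at time t.
(c) Find the normal plane, osculating plane and rectifying plane to r(t) at the point
$(-3, 0, 4\pi)$.

Solution:
(a) $T(t) = \frac{r'(t)}{|r'(t)|} = \frac{\langle -3\sin(t), 3\cos(t), 4 \rangle}{5} = \langle -\frac{3}{5}\sin(t), \frac{3}{5}\cos(t), \frac{4}{5} \rangle$.

$N(t) = \frac{T'(t)}{|T'(t)|} = \langle -\cos(t), -\sin(t), 0 \rangle$.

$$B(t) = T(t) \times N(t) = \det \begin{pmatrix} i & j & k \\ -\frac{3}{5}\sin(t) & \frac{3}{5}\cos(t) & \frac{4}{5} \\ -\cos(t) & -\sin(t) & 0 \end{pmatrix} = \langle \tfrac{4}{5}\sin(t), -\tfrac{4}{5}\cos(t), \tfrac{3}{5} \rangle.$$

(b) $$\kappa(t) = \frac{|r'(t) \times r''(t)|}{|r'(t)|^3} = \frac{|\langle -3\sin(t), 3\cos(t), 4 \rangle \times \langle -3\cos(t), -3\sin(t), 0 \rangle|}{125}
= \frac{1}{125}\left| \det \begin{pmatrix} i & j & k \\ -3\sin(t) & 3\cos(t) & 4 \\ -3\cos(t) & -3\sin(t) & 0 \end{pmatrix} \right|
= \frac{|\langle 12\sin(t), -12\cos(t), 9 \rangle|}{125} = \frac{15}{125} = \frac{3}{25}.$$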


Note that though this curvature is constant, in general, one typically gets different 
curvatures at different points along a curve. 
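The helix computation above can be checked numerically. The sketch below is illustrative only: it evaluates $T$, $N$, $B$ and $\kappa$ from $r'$ and $r''$, using $\kappa = |r' \times r''|/|r'|^3$, the identity $N = B \times T$, and the observation (which follows from $r' \times r'' = v^2(T \times T')$) that $B$ points in the direction of $r' \times r''$. The function name `helix_frame` is an assumption made for this example.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(c * c for c in u))

def helix_frame(t):
    """Return (T, N, B, kappa) for r(t) = <3 cos t, 3 sin t, 4t>."""
    r1 = (-3 * math.sin(t), 3 * math.cos(t), 4.0)    # r'(t)
    r2 = (-3 * math.cos(t), -3 * math.sin(t), 0.0)   # r''(t)
    speed = norm(r1)
    c = cross(r1, r2)
    T = tuple(x / speed for x in r1)
    B = tuple(x / norm(c) for x in c)   # B is the unit vector along r' x r''
    N = cross(B, T)                     # N = B x T
    kappa = norm(c) / speed ** 3
    return T, N, B, kappa

T, N, B, kappa = helix_frame(math.pi)
print(T, N, B, kappa)
```

At $t = \pi$ the printed values should be close to $T = \langle 0, -\frac{3}{5}, \frac{4}{5} \rangle$, $N = \langle 1, 0, 0 \rangle$, $B = \langle 0, \frac{4}{5}, \frac{3}{5} \rangle$ and $\kappa = \frac{3}{25}$, up to floating-point rounding.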


(c) The point $(-3, 0, 4\pi)$ occurs on the curve r(t) when $t = \pi$. So, we plug $\pi$ into r'(t) to
obtain $r'(\pi) = \langle 0, -3, 4 \rangle$. Thus, the normal plane is a plane perpendicular to $\langle 0, -3, 4 \rangle$
and containing the point $(-3, 0, 4\pi)$, which has equation $0(x + 3) - 3(y - 0) + 4(z - 4\pi) = 0$
or $4z - 3y = 16\pi$.

Since the binormal vector $B(\pi) = \langle 0, \frac{4}{5}, \frac{3}{5} \rangle$ is perpendicular to the osculating plane,
the osculating plane is $0(x + 3) + 4(y - 0) + 3(z - 4\pi) = 0$ or $4y + 3z = 12\pi$.
Since the principal unit normal vector is perpendicular to the rectifying plane, and
$N(\pi) = \langle 1, 0, 0 \rangle$, we have that the rectifying plane is $1(x + 3) + 0(y - 0) + 0(z - 4\pi) = 0$
or $x = -3$.


Theorem 13.7. Let $r : I \to \mathbb{R}^3$ be a regular curve. Then the acceleration
$a(t) = r''(t) = a_T(t)T(t) + a_N(t)N(t)$, where $a_T(t) = v' = \frac{r'(t) \cdot r''(t)}{|r'(t)|}$ and
$a_N(t) = \kappa v^2 = \frac{|r'(t) \times r''(t)|}{|r'(t)|}$.

Proof. As in the proof of Theorem 13.5, we note $r''(t) = (T(t)v(t))' = T'(t)v(t) + T(t)v'(t)$.
Since $T'(s) = \kappa N(s)$, by the Chain Rule we see that $T'(t) = T'(s(t))s'(t) = \kappa v(t)N(t)$.
Thus, $T'(t)v(t) = \kappa(v(t))^2 N(t)$, where
$\kappa(v(t))^2 = \frac{|r'(t) \times r''(t)|}{|r'(t)|^3}|r'(t)|^2 = \frac{|r'(t) \times r''(t)|}{|r'(t)|}$.

Similarly, we see that
$r'(t) \cdot r''(t) = r'(t) \cdot (T'(t)v(t) + T(t)v'(t)) = v(t)T(t) \cdot (T'(t)v(t) + T(t)v'(t)) = v(t)v'(t)$.
Thus, $v'(t) = \frac{r'(t) \cdot r''(t)}{|r'(t)|}$.




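Hence, $a(t) = r''(t) = a_T(t)T(t) + a_N(t)N(t)$, where $a_T(t) = v'(t) = \frac{r'(t) \cdot r''(t)}{|r'(t)|}$ and
$a_N(t) = \kappa(v(t))^2 = \frac{|r'(t) \times r''(t)|}{|r'(t)|}$.

Let $r : I \to \mathbb{R}^3$ be a regular curve. We refer to $a_T(t)$ as
the tangential component of acceleration and to $a_N(t)$ as the
normal component of acceleration.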


Notice that the acceleration of a curve is the sum of vectors in the tangent direction and in
the principal unit normal direction to the curve, both of which are in the osculating plane,
making the osculating plane, in this sense, the plane that best fits the motion of the curve 
locally. 


Example 13.8. Find the tangential and normal components of acceleration to the curve 
miH=20 21 Satta 1. 


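Solution. $r'(t) = \langle 2t, 2, 0 \rangle$ and $r''(t) = \langle 2, 0, 0 \rangle$. Plugging into the formulas at $t = 1$ we
obtain

$$a_T(1) = \frac{\langle 2,2,0 \rangle \cdot \langle 2,0,0 \rangle}{|\langle 2,2,0 \rangle|} = \frac{4}{2\sqrt{2}} = \sqrt{2} \quad \text{and} \quad
a_N(1) = \frac{|\langle 2,2,0 \rangle \times \langle 2,0,0 \rangle|}{|\langle 2,2,0 \rangle|} = \frac{|\langle 0,0,-4 \rangle|}{2\sqrt{2}} = \frac{4}{2\sqrt{2}} = \sqrt{2}.$$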
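The two component formulas are easy to evaluate mechanically. The following sketch (with the hypothetical helper name `acceleration_components`) computes $a_T = \frac{r' \cdot r''}{|r'|}$ and $a_N = \frac{|r' \times r''|}{|r'|}$ from a velocity and an acceleration vector, here the values $r'(1) = \langle 2, 2, 0 \rangle$ and $r''(1) = \langle 2, 0, 0 \rangle$ from the preceding example.

```python
import math

def dot(u, v):
    """Dot product of two vectors given as tuples."""
    return sum(p * q for p, q in zip(u, v))

def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    """Euclidean norm of a vector."""
    return math.sqrt(dot(u, u))

def acceleration_components(r1, r2):
    """Return (a_T, a_N) given r1 = r'(t) and r2 = r''(t)."""
    speed = norm(r1)
    a_T = dot(r1, r2) / speed
    a_N = norm(cross(r1, r2)) / speed
    return a_T, a_N

a_T, a_N = acceleration_components((2.0, 2.0, 0.0), (2.0, 0.0, 0.0))
print(a_T, a_N)  # both should be close to sqrt(2)
```

Since $T$ and $N$ are perpendicular unit vectors, $a_T^2 + a_N^2 = |r''|^2$, which gives a quick consistency check on any computed pair.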


Note that the tangential and normal components of acceleration are usually not equal; 
these values just happened to be the same in the preceding example. 


Finally, we include a theorem to demonstrate how Frenet formulas can be useful. 


Theorem 13.8. Local canonical form. Let $r : I \to \mathbb{R}^3$ be a regular parametrized curve
with no singular points of order one where $0 \in I$. If we use the coordinate system where
r(0) is the origin and the direction of the x-axis is T, the direction of the y axis is N and
the direction of the z axis is B where $r(s) = (x(s), y(s), z(s))$, then there are functions
$R_x, R_y, R_z : I \to \mathbb{R}$ so that

(i) $x(s) = s - \frac{\kappa^2}{6}s^3 + R_x$

(ii) $y(s) = \frac{\kappa}{2}s^2 + \frac{\kappa'}{6}s^3 + R_y$

(iii) $z(s) = -\frac{\kappa\tau}{6}s^3 + R_z$

and $\lim_{s \to 0} \frac{R_x}{s^3} = \lim_{s \to 0} \frac{R_y}{s^3} = \lim_{s \to 0} \frac{R_z}{s^3} = 0$.




Proof. By Taylor's Theorem we know that $r(s) = r(0) + sr'(0) + \frac{s^2}{2}r''(0) + \frac{s^3}{6}r'''(0) + R$, where
$\lim_{s \to 0} \frac{R}{s^3} = 0$ and $R = (R_x, R_y, R_z)$. Since we know that $r'(0) = T(0)$ and $r''(0) = \kappa(0)N(0)$
and $r'''(0) = \kappa'(0)N(0) + \kappa(0)N'(0) = \kappa'(0)N(0) - \kappa^2(0)T(0) - \kappa(0)\tau(0)B(0)$, we can plug
these in to the preceding equation to simplify to
$r(s) - r(0) = sT + \frac{s^2}{2}\kappa N + \frac{s^3}{6}(\kappa' N - \kappa^2 T - \kappa\tau B) + \langle R_x, R_y, R_z \rangle$.
Factoring out T, N, B and combining terms, in the given coordinate
system this simplifies to the equations given above.


Note that we can reparametrize any curve so that a given point is the image of 0 and
the interval I contains zero, so this technique can be used fairly generally.

This form of a curve lets you readily see some properties relating to the planes we
mentioned. For example, when s is small the x component is largest (having the lowest
power term) and changes sign at s = 0, which means that the curve always crosses through
the normal plane, and likewise the y coordinate is positive for small nonzero s (since the
curvature is positive), making the curve locally lie on the side of the rectifying plane
toward which the unit normal points near the point r(0). These sorts of observations
help us to understand the shape of the curve.

This development may not obviously explain why we bothered with the torsion as a 
separate term. To motivate torsion, we mention that if the curvature is positive on an 
interval and the torsion is known, and both are differentiable, then these two functions 
uniquely determine the curve on the interval in question up to a rigid motion. This result 
is called the Fundamental Theorem of the Local Theory of Curves. 


Line Integrals 


Definition 106 


A vector field F on $\mathbb{R}^2$ is a function $F : \mathbb{R}^2 \to \mathbb{R}^2$ (or from $\mathbb{R}^3$ to itself), where the
images are the associated vectors corresponding to the points in the images rather
than the points themselves (using our convention that a point or vector in Euclidean
space are interchangeable depending on context). We say that F is a gradient vector
field or a conservative vector field if $F = \nabla f$ for some function $z = f(x, y)$. If
C is the image of $r : [a,b] \to \mathbb{R}^n$, which is a $C^1$ curve, we say that C is oriented
from r(a) = p to r(b) = q. Let f be a function taking the image of r(t) into $\mathbb{R}$.
Then we define the line integral of f(x, y) along C with respect to arc length to be
$\int_C f\,ds = \int_a^b f(r(t))|r'(t)|\,dt$. For a vector field F we say the line integral of F along
C is $\int_C F \cdot dr = \int_C F \cdot T\,ds$. Since $T(t) = \frac{r'(t)}{|r'(t)|}$, we can rewrite this expression as
$\int_C F \cdot dr = \int_a^b F(r(t)) \cdot r'(t)\,dt$.


We often express path integrals along vector fields in terms of sums of path integrals with
respect to individual variables. We define the integral of function $z = f(x, y)$ along $C^1$ curve
C with respect to a particular variable, say variable x, to be $\int_C f\,dx = \int_a^b f(r(t))x'(t)\,dt$, and
define integration with respect to other variables similarly. Thus, if $F = \langle P, Q \rangle$ we can
represent $\int_C F \cdot dr = \int_a^b F(r(t)) \cdot r'(t)\,dt = \int_a^b P(r(t))x'(t) + Q(r(t))y'(t)\,dt = \int_C P\,dx + Q\,dy$.

Likewise, if $F = \langle P, Q, R \rangle$ then $\int_C F \cdot dr = \int_C P\,dx + Q\,dy + R\,dz$.


Line integrals were developed in the early eighteen hundreds and they are helpful in 
analyzing problems involving fluid flow, gas flow, work, force and electrical current. For 
example, if each vector assigned to a point represents the velocity of a fluid at that point 
then the vector field could model the flow of a river. If the vectors indicate force due to 
something like repulsion by a magnetic field or the effects of forces induced by gravity in 
Newtonian models (in a relativistic model gravity is technically not a force, though it results 
in forces being applied by one object against another) then the vector field could indicate the 
directions and magnitudes of the forces. Integrals of functions can indicate accumulation or 
loss along a path. For instance, if someone is moving along a path on a road with variable 
amounts of gravel and is collecting gravel along the way then the line integral could indicate 
the total amount of gravel collected moving along the path. Alternately, it could represent 
total charge acquired along a plate or total heat energy absorbed moving along a path or 
total mass of a wire along a path (thinking of the path as accumulating mass in amounts 
based on the linear density or mass per unit length along the path). 


Example 13.9. Evaluate $\int_C y^3\,ds$, where C is the path along the curve $y = \sqrt{4 - x^2}$ from
(2,0) to (−2,0).

Solution. First, we parametrize the path C, which is $r(t) = \langle 2\cos(t), 2\sin(t) \rangle$ over
$0 \le t \le \pi$. Then $\int_C y^3\,ds = \int_0^{\pi} 8\sin^3(t)\sqrt{4\sin^2(t) + 4\cos^2(t)}\,dt = (8)(2)(2)\left(\frac{2}{3}\right) = \frac{64}{3}$.


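Example 13.10. Let $F(x, y) = \langle 2y, x^2 \rangle$. Find the line integral of F along the line
segment C from (0,1) to (4,2).

Solution. First, we parametrize the path C, which is $r(t) = \langle 4t, 1 + t \rangle$ over $0 \le t \le 1$. Then

$$\int_C F \cdot dr = \int_0^1 F(r(t)) \cdot r'(t)\,dt = \int_0^1 \langle 2 + 2t, 16t^2 \rangle \cdot \langle 4, 1 \rangle\,dt = \int_0^1 8 + 8t + 16t^2\,dt
= \left[ 8t + 4t^2 + \frac{16t^3}{3} \right]_0^1 = \frac{52}{3}.$$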
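Both kinds of line integral reduce to ordinary single-variable integrals once a parametrization is chosen, so they can be approximated numerically. The sketch below is a minimal illustration using Simpson's rule; the helper names are assumptions made for this example, and the scalar integrand $y^3$ on the upper semicircle of radius 2 reflects one reading of Example 13.9.

```python
import math

def simpson(g, lo, hi, n=2000):
    """Composite Simpson's rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3

def scalar_line_integral(f, r, dr, a, b):
    """Approximate the integral of f ds: f(r(t)) * |r'(t)| integrated over [a, b]."""
    speed = lambda t: math.sqrt(sum(c * c for c in dr(t)))
    return simpson(lambda t: f(r(t)) * speed(t), a, b)

def vector_line_integral(F, r, dr, a, b):
    """Approximate the integral of F . dr: F(r(t)) . r'(t) integrated over [a, b]."""
    return simpson(lambda t: sum(p * q for p, q in zip(F(r(t)), dr(t))), a, b)

# Vector field <2y, x^2> along the segment r(t) = <4t, 1 + t>, 0 <= t <= 1.
work = vector_line_integral(lambda p: (2 * p[1], p[0] ** 2),
                            lambda t: (4 * t, 1 + t),
                            lambda t: (4.0, 1.0), 0.0, 1.0)

# Scalar function y^3 along r(t) = <2 cos t, 2 sin t>, 0 <= t <= pi.
mass = scalar_line_integral(lambda p: p[1] ** 3,
                            lambda t: (2 * math.cos(t), 2 * math.sin(t)),
                            lambda t: (-2 * math.sin(t), 2 * math.cos(t)), 0.0, math.pi)

print(work, mass)  # compare with 52/3 and 64/3
```

The derivative `dr` is passed in explicitly rather than approximated, so the only numerical error comes from the quadrature.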


Line integrals (or path integrals) have concrete physical representations. We mentioned 
a few of those above. For instance, if a function f is non-negative and represents the mass 
per unit length along smooth curve C then the integral of f along C' represents the mass 
of the curve. Another visible way of looking at a line integral of a function is that if the 
curve is in R? then we can think of the function values as indicating the heights of points 
directly over (or the depths directly below) the curve in a third dimension (like a curtain of 




variable height). The area of the resulting curtain would be another way to visualize this 
scalar line integral. 

If a vector field F represents the force acting on a particle as it traverses the curve C’ 
then the integral of F along C' represents the work done by the force field F as the particle 
moves along the curve C. 


Theorem 13.9. Fundamental Theorem for Line Integrals. Let C be the $C^1$ path r(t),
$a \le t \le b$ in $\mathbb{R}^n$, and let f be a $C^1$ function on an open set U containing C. Then

$$\int_C \nabla f \cdot dr = f(r(b)) - f(r(a)).$$

Furthermore, if Q, parametrized by $w : [a,b] \to \mathbb{R}^n$, is a piecewise smooth path from a
point s to a point e that lies within U then $\int_Q \nabla f \cdot dr = f(e) - f(s)$.
Q 


Proof. We know $\int_C \nabla f \cdot dr = \int_a^b \nabla f(r(t)) \cdot r'(t)\,dt = \int_a^b (f(r(t)))'\,dt$ using the Chain Rule.
But then, by the Fundamental Theorem of Calculus, this is equal to $f(r(b)) - f(r(a))$.

Since Q is a piecewise smooth path, we know that Q consists of a finite sequence of
paths from s to some point $x_1$ followed by a path from $x_1$ to some point $x_2$ and so on until
we finish with a path from a point $x_m$ to e, the ending point of the path. Since smooth
paths are $C^1$ we can see from the argument above that
$\int_Q \nabla f \cdot dr = (f(x_1) - f(s)) + (f(x_2) - f(x_1)) + \dots + (f(e) - f(x_m)) = f(e) - f(s)$ as desired.


Example 13.11. Let C be the path consisting of the line segment from (0,0,0) to (1,6,2)
followed by the line segment from (1,6,2) to (8,0,4), followed by the line segment from
(8,0,4) to (1,1,3). Let $F(x, y, z) = \langle 2xe^y, x^2e^y, 2z \rangle$. Find $\int_C F \cdot dr$.

Solution. First, if $F = \nabla f$ for some potential function f then we know that $f_x = 2xe^y$,
$f_y = x^2e^y$, and $f_z = 2z$. Thus, it must follow that $f = x^2e^y + g_1(y, z)$ and $f = x^2e^y + g_2(x, z)$ and
$f = z^2 + g_3(x, y)$ for some differentiable $g_1, g_2, g_3$. We see that setting $g_1(y, z) = g_2(x, z) = z^2$
and $g_3(x, y) = x^2e^y$ we would have the function $f(x, y, z) = x^2e^y + z^2$ in each case, which
makes this a potential function for F. From the Fundamental Theorem of Line Integrals it
follows that $\int_C F \cdot dr = f(1,1,3) - f(0,0,0) = e + 9$.
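To see the Fundamental Theorem for Line Integrals at work on this example, one can integrate $F$ directly over the three segments and compare with the difference of potential values. This is a hedged numerical sketch; `simpson` and `segment_integral` are helper names assumed for the illustration.

```python
import math

def simpson(g, lo, hi, n=4000):
    """Composite Simpson's rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3

def F(p):
    """The vector field <2x e^y, x^2 e^y, 2z>."""
    x, y, z = p
    return (2 * x * math.exp(y), x * x * math.exp(y), 2 * z)

def f(p):
    """The potential function x^2 e^y + z^2."""
    x, y, z = p
    return x * x * math.exp(y) + z * z

def segment_integral(field, p, q):
    """Integral of field . dr along the segment r(t) = p + t (q - p), 0 <= t <= 1."""
    d = tuple(qi - pi for pi, qi in zip(p, q))        # r'(t) is constant on a segment
    def integrand(t):
        point = tuple(pi + t * di for pi, di in zip(p, d))
        return sum(Fi * di for Fi, di in zip(field(point), d))
    return simpson(integrand, 0.0, 1.0)

vertices = [(0, 0, 0), (1, 6, 2), (8, 0, 4), (1, 1, 3)]
direct = sum(segment_integral(F, p, q) for p, q in zip(vertices, vertices[1:]))
via_potential = f(vertices[-1]) - f(vertices[0])
print(direct, via_potential)  # both should be close to e + 9
```

The intermediate vertices contribute telescoping terms, which is why only the first and last points matter in `via_potential`.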


Definition 107 


Let F be a continuous vector field on an open, connected set D containing a
piecewise smooth path C which begins at point a and ends at b. Then we say that
$\int_C F \cdot dr$ is path independent in D if for any other path $C_1$ which begins and ends at
the same points as path C, the line integral $\int_{C_1} F \cdot dr = \int_C F \cdot dr$, in which case we
also write the integral as $\int_a^b F \cdot dr$ (since the integral value is only a function of the
end points and not the path).


Theorem 13.10. (a) Let F be continuous on an open connected region D in $\mathbb{R}^n$ and let
$\int_C F \cdot dr$ be path independent in D. Then $F = \nabla f$ for some $C^1$ function f.

(b) Let F be continuous on an open connected region D. Then F is path independent if
and only if F is a gradient vector field.


Proof. (a) We first give the proof assuming $D \subset \mathbb{R}^2$ and $F = \langle P, Q \rangle$, where P, Q are
continuous functions from D into $\mathbb{R}$. Let $(a,b) \in D$ and set $f(x,y) = \int_{(a,b)}^{(x,y)} F \cdot dr$ and
choose an open ball B so that $(x,y) \in B \subset D$ with $(x_1, y) \in B$ for some $x_1 < x$. Let $C_1$
be a piecewise smooth path from (a,b) to $(x_1, y)$ and let $C_2$ be the horizontal line segment
from $(x_1, y)$ to $(x, y)$. Then $\int_{(a,b)}^{(x,y)} F \cdot dr = \int_{C_1} F \cdot dr + \int_{C_2} F \cdot dr$. The first of these
path integrals is a constant that does not depend on x and along the second path the
variable y is constant. Hence, we have
$f_x(x,y) = 0 + \frac{\partial}{\partial x}\int_{C_2} P(r(t))x'(t) + Q(r(t))y'(t)\,dt = \frac{\partial}{\partial x}\int_{C_2} P(r(t))x'(t)\,dt$.
Using the parametrization $r(t) = (t, y)$, $x_1 \le t \le x$ for $C_2$ this
simplifies to $\frac{\partial}{\partial x}\int_{x_1}^{x} P(t,y)\,dt = P(x,y)$ by the Fundamental Theorem of Calculus.

To show that $f_y = Q$ you instead choose $(x, y_1) \in B$ with $y_1 < y$, and then the
argument is similar. You just set $C_1$ to be a path from (a,b) to $(x, y_1)$, and set $C_2$ to be
the path $r(t) = (x, t)$, $y_1 \le t \le y$ and get that
$f_y(x,y) = 0 + \frac{\partial}{\partial y}\int_{C_2} Q(r(t))y'(t) + 0\,dt = \frac{\partial}{\partial y}\int_{y_1}^{y} Q(r(t))\,dt = Q(x,y)$.

The argument is similar in $\mathbb{R}^n$, letting $a \in D$ and $f(x) = \int_a^x F \cdot dr$, except that if $D \subset \mathbb{R}^n$
we create an additional path for each coordinate derivative, and all the points we have thus
far listed have n coordinate entries instead of two. More specifically, we take ball $B_\epsilon(x) \subset D$
and let $C_1$ be a path from a to $(x_1, x_2, \dots, x_i - \delta, x_{i+1}, \dots, x_n) = x_0$, where $\delta < \epsilon$. Let $C_2$ be
the line segment parametrized by $r(t) = (x_1, x_2, \dots, t, x_{i+1}, \dots, x_n)$, where $x_i - \delta \le t \le x_i$.
We can extend this path so that r is defined on $(x_i - \delta, x_i + \delta)$ so that we can differentiate
the path integral at $x_i = t$. When differentiating with respect to $x_i$ we can treat the other
variables as fixed and for points $z = (x_1, x_2, x_3, \dots, u, x_{i+1}, \dots, x_n)$ where $|u - x_i| < \delta$ we
have $f(z) = \int_{C_1} F \cdot dr + \int_{C_2(z)} F \cdot dr$, where $C_2(z)$ is the path $(x_1, x_2, \dots, t, x_{i+1}, \dots, x_n)$ for
$x_i - \delta \le t \le u$. Then $\frac{\partial f}{\partial x_i} = \left(\int_{C_1} F \cdot dr + \int_{C_2} F \cdot dr\right)_{x_i}$. The first integral is constant and
has derivative zero. The second is $\frac{\partial}{\partial u}\int_{x_i - \delta}^{u} F(r(t)) \cdot r'(t)\,dt = F_i(r(u))$. Thus, at $u = x_i$, the
partial derivative is $F_i(x)$ as desired.

(b) If $F = \nabla f$, a continuous vector field, then by the Fundamental Theorem of Line
Integrals we know that for any piecewise smooth path C from a to x, the path integral
$\int_C F \cdot dr = f(x) - f(a)$, meaning that F is path independent. Thus, by part (a) it follows
that F is path independent if and only if F is a gradient vector field.


From this theorem, we see that path independence and a vector field being conservative 
or a gradient vector field on a connected open set are all equivalent assuming a vector field’s 
component entries are continuous. 


Example 13.12. Let $F(x, y, z) = \langle 2xe^{y^2}, 2yx^2e^{y^2}, 2z + x \rangle$ be the force acting on a particle
at point (x, y, z) in Newtons. Let C be the path $r(t) = \langle t, -2t, 2t \rangle$ on $0 \le t \le 1$ from
(0,0,0) to (1,−2,2). Find the work done by the vector field acting on a particle traversing
the path C, assuming the distances indicated by the variables are measured in meters.

Solution. The work is the integral of the vector field along C, which is $\int_C F \cdot dr$. Initially,
we might hope this vector field is conservative. If $F = \nabla f$ then antidifferentiating the
components of F should give f. Thus, $f = x^2e^{y^2} + g_1(y, z)$ and $f = x^2e^{y^2} + g_2(x, z)$ and
$f = z^2 + xz + g_3(x, y)$ for some differentiable $g_1, g_2, g_3$. This could almost be done in the
sense that if we set $f = x^2e^{y^2} + z^2$ then $\nabla f = \langle 2xe^{y^2}, 2yx^2e^{y^2}, 2z \rangle = F_1$. However, we
have a leftover xz summand in the third coordinate that cannot be compensated for. There
is no way to just add a function of y and z in $f = x^2e^{y^2} + g_1(y, z)$ and get xz because
xz is not a function of only y and z. Thus, this vector field is not conservative. However, we
can write $F = F_1 + F_2$, where $F_2 = \langle 0, 0, x \rangle$. Since $\int_C F \cdot dr = \int_C F_1 \cdot dr + \int_C F_2 \cdot dr$ we
can still use the Fundamental Theorem of Line Integrals to evaluate the first summand, and
this will make our work easier. We note that $\int_C F_1 \cdot dr = f(1,-2,2) - f(0,0,0) = e^4 + 4$.
We use the usual formula to take $\int_C F_2 \cdot dr = \int_0^1 \langle 0, 0, t \rangle \cdot \langle 1, -2, 2 \rangle\,dt = \int_0^1 2t\,dt = 1$.
Hence, $\int_C F \cdot dr = e^4 + 4 + 1 = e^4 + 5$ Joules.
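The split $F = F_1 + F_2$ can be checked against a direct numerical evaluation of $\int_0^1 F(r(t)) \cdot r'(t)\,dt$. The sketch below is illustrative; the helper names are assumptions, and the potential used is $f = x^2e^{y^2} + z^2$ from the solution above.

```python
import math

def simpson(g, lo, hi, n=2000):
    """Composite Simpson's rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3

def F(x, y, z):
    """The full force field <2x e^(y^2), 2y x^2 e^(y^2), 2z + x>."""
    e = math.exp(y * y)
    return (2 * x * e, 2 * y * x * x * e, 2 * z + x)

def f(x, y, z):
    """Potential for the conservative part F_1 = <2x e^(y^2), 2y x^2 e^(y^2), 2z>."""
    return x * x * math.exp(y * y) + z * z

def integrand(t):
    """F(r(t)) . r'(t) for r(t) = <t, -2t, 2t>, whose derivative is <1, -2, 2>."""
    Fx, Fy, Fz = F(t, -2 * t, 2 * t)
    return Fx * 1 + Fy * (-2) + Fz * 2

direct_work = simpson(integrand, 0.0, 1.0)

# Split evaluation: potential difference for F_1 plus the integral of F_2 . dr = 2t dt.
split_work = (f(1, -2, 2) - f(0, 0, 0)) + simpson(lambda t: 2 * t, 0.0, 1.0)

print(direct_work, split_work)  # both should be close to e^4 + 5
```

Here the direct route is not much harder, but the split avoids integrating the exponential terms at all.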


Depending on the vector field, this process may not help. In many cases it is faster
to just use the formula for a line integral along a vector field directly rather than find a
potential function for only part of the vector field. However, if, when looking for a potential
function, you find that there isn't one but that a candidate function comes very close, it
may in some cases be worth checking whether writing the original vector field as a sum
in this way might be beneficial.


It may not be obvious from the definition, but if we have two paths and one ends where 
the next begins then there is a path consisting of the first path followed by the second. 
It is also true that the parametrization does not affect the path integral (of either type). 
And reversing the orientation of a vector field integral negates the integral along the path, 




whereas reversing the orientation of an integral of a scalar function over a path does not 
affect the integral. 


Independence of parametrization: 


Theorem 13.11. Let C be the smooth curve $r : [a,b] \to D \subset \mathbb{R}^n$, and let −C be the smooth
curve $r_- : [a,b] \to D$ of reversed orientation defined by $r_-(t) = r(a + b - t)$. Then:

(a) Let $f : D \to \mathbb{R}$ be a function so that $\int_C f\,ds$ exists. Then $\int_{-C} f\,ds = \int_C f\,ds$.

(b) Let F be a continuous vector field on D. Then $\int_{-C} F \cdot dr = -\int_C F \cdot dr$.
-—C Cc 


Proof. (a) By definition we see that $\int_C f\,ds = \int_a^b f(r(t))|r'(t)|\,dt$. If we use the substitution
$u = a + b - t$ then $t = a + b - u$ and $du = -dt$, which means that

$$\int_a^b f(r(t))|r'(t)|\,dt = -\int_b^a f(r(a+b-u))|r'(a+b-u)|\,du = \int_a^b f(r(a+b-u))|r'(a+b-u)|\,du = \int_{-C} f\,ds.$$

(b) We have $\int_C F \cdot dr = \int_a^b F(r(t)) \cdot r'(t)\,dt$. If we use the substitution $u = a + b - t$ then
$t = a + b - u$ and $du = -dt$, which means that

$$\int_C F \cdot dr = -\int_b^a F(r(a+b-u)) \cdot r'(a+b-u)\,du = \int_a^b F(r(a+b-u)) \cdot r'(a+b-u)\,du.$$

Now, $(r(a+b-u))' = -r'(a+b-u)$ by the chain rule, which means that

$$\int_a^b F(r(a+b-u)) \cdot r'(a+b-u)\,du = -\int_a^b F(r_-(u)) \cdot r_-'(u)\,du = -\int_{-C} F \cdot dr.$$
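The effect of reversing orientation can be observed numerically. In this sketch the curve (a quarter of the unit circle), the scalar function and the vector field are arbitrary choices made for illustration, and the reversed curve is $r_-(t) = r(a + b - t)$ with derivative $-r'(a + b - t)$ as in the proof.

```python
import math

def simpson(g, lo, hi, n=2000):
    """Composite Simpson's rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3

a, b = 0.0, math.pi / 2
r = lambda t: (math.cos(t), math.sin(t))
dr = lambda t: (-math.sin(t), math.cos(t))
r_rev = lambda t: r(a + b - t)                        # reversed orientation
dr_rev = lambda t: tuple(-c for c in dr(a + b - t))   # chain rule introduces the minus sign

f = lambda p: p[0] + 2 * p[1]          # sample scalar function (assumed for illustration)
F = lambda p: (-p[1], p[0] ** 2)       # sample vector field (assumed for illustration)

def scalar_integral(path, dpath):
    """Integral of f ds along the given parametrization."""
    return simpson(lambda t: f(path(t)) * math.sqrt(sum(c * c for c in dpath(t))), a, b)

def vector_integral(path, dpath):
    """Integral of F . dr along the given parametrization."""
    return simpson(lambda t: sum(p * q for p, q in zip(F(path(t)), dpath(t))), a, b)

print(scalar_integral(r, dr), scalar_integral(r_rev, dr_rev))   # equal
print(vector_integral(r, dr), vector_integral(r_rev, dr_rev))   # opposite signs
```

The scalar integral only sees the speed $|r'|$, which is unchanged by reversal, while the vector integral sees the direction of $r'$ and so changes sign.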


Theorem 13.12. Let $r_1 : [a_1, b_1] \to D$, and let $r_2 : [a_2, b_2] \to D$ be paths, where $r_1$ ends
at the point where $r_2$ begins. Then there is a
path corresponding to $r_1$ followed by $r_2$, meaning that there is a path $r_3 : [a,b] \to D$ so that
for some point $c \in [a,b]$ the restriction $r_3|_{[a,c]}$ of $r_3$ to [a,c] has an image which is the
trace of $r_1$, and the restriction $r_3|_{[c,b]}$ to [c,b] has image which is the trace of $r_2$, and the
orientation of $r_3|_{[a,c]}$ is the same as that of $r_1$ and the orientation of $r_3|_{[c,b]}$ is the same as
the orientation of $r_2$.


Proof. Let $a = a_1$ and let $c = b_1$ and let $b = b_1 + b_2 - a_2$. We define $r_3(t) = r_1(t)$ if
$t \in [a_1, b_1]$ and we define $r_3(t) = r_2(t + a_2 - b_1)$ if $t \in [b_1, b_1 + b_2 - a_2]$. Then $r_3|_{[a_1, b_1]} = r_1$,
and for every $a_2 + t \in [a_2, b_2]$ there is exactly one corresponding value $b_1 + t \in [b_1, b_1 + b_2 - a_2]$
so that $r_3(b_1 + t) = r_2(b_1 + t + a_2 - b_1) = r_2(a_2 + t)$ for all $0 \le t \le b_2 - a_2$ and vice versa.
Thus, the trace of $r_2$ and the trace of $r_3|_{[b_1, b_1 + b_2 - a_2]}$ are the same. We note that the starting
and ending points of $r_3|_{[b_1, b_1 + b_2 - a_2]}$ are the starting and ending point of $r_2$ as well, so the
orientations are preserved.
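The construction in this proof translates directly into code. The following sketch builds $r_3$ from two parametrizations by shifting the parameter of the second one; the function name `concatenate` and the two sample paths are assumptions made for illustration.

```python
def concatenate(r1, a1, b1, r2, a2, b2):
    """Return (r3, a, c, b): r3 follows r1 on [a, c] and then r2 on [c, b].

    Mirrors the proof: a = a1, c = b1, b = b1 + b2 - a2, and
    r3(t) = r2(t + a2 - b1) on the second piece.
    """
    a, c, b = a1, b1, b1 + b2 - a2

    def r3(t):
        if a <= t <= c:
            return r1(t)
        if c < t <= b:
            return r2(t + a2 - b1)
        raise ValueError("parameter outside [a, b]")

    return r3, a, c, b

# Sample paths (hypothetical): a horizontal segment ending at (1, 0),
# followed by a vertical segment starting at (1, 0).
r1 = lambda t: (t, 0.0)          # defined on [0, 1]
r2 = lambda t: (1.0, t - 2.0)    # defined on [2, 5]

r3, a, c, b = concatenate(r1, 0.0, 1.0, r2, 2.0, 5.0)
print(a, c, b)                   # 0.0 1.0 4.0
print(r3(0.5), r3(1.0), r3(2.5), r3(4.0))
```

Note that the sketch does not itself check that $r_1(b_1) = r_2(a_2)$; if the endpoints do not match, the resulting $r_3$ is not continuous at $c$.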




Theorem 13.13. Let $C_1$ be the smooth curve $r_1 : [a_1, b_1] \to D \subset \mathbb{R}^n$, and let $C_2$ be the
smooth curve $r_2 : [a_2, b_2] \to D$, where $C_1$ and $C_2$ have the same trace and the same orientation
(starting and ending point). If $f : D \to \mathbb{R}$ is continuous and F is a continuous vector field on D then

$$\int_{C_1} f\,ds = \int_{C_2} f\,ds \quad \text{and} \quad \int_{C_1} F \cdot dr = \int_{C_2} F \cdot dr.$$


Proof. First, note that the requirement that f and F and the derivatives $r_1'$ and $r_2'$ be
continuous guarantees that all of these integrals exist since all continuous functions are
integrable and we know that $f(r(t))|r'(t)|$ and $F \cdot r'$ are continuous.

Let C be given by $r : [a,b] \to \mathbb{R}^n$, a smooth curve. Then we have shown we can
parametrize the curve with respect to arc length as $S(t) = \int_a^t |r'(u)|\,du$ defined on [a,b],
where S(t) is the length of the curve r restricted to [a,t]. This is a one to one increasing
function whose derivative is $|r'(t)|$, so S is $C^1$. Hence, the inverse t(S) of S(t) exists, is
increasing and is $C^1$ on [0, L], where L is the length of C. We could then parametrize r
with respect to arc length as $C_S$, which is $r(t(S)) : [0, L] \to \mathbb{R}^n$, and the image would be
the trace of C (so the trace of C and $C_S$ are the same).

Thus,
$$\int_{C_S} f\,ds = \int_0^L f(r(t(S)))\,|(r(t(S)))'|\,dS = \int_0^L f(r(t(S)))\,|r'(t(S))||t'(S)|\,dS = \int_0^L f(r(t(S)))\,|r'(t(S))|\,t'(S)\,dS$$
since t(S) is increasing and has positive derivative. Setting
$u = t(S)$, we have $du = t'(S)\,dS$, so the integral becomes $\int_a^b f(r(u))|r'(u)|\,du = \int_C f\,ds$.

Thus, replacing r with $r_1$ and then with $r_2$, we see that $\int_{C_1} f\,ds = \int_{C_2} f\,ds$.

An integral along a vector field is just an integral of the function $f = F \cdot T$, and is
therefore a particular case of the preceding result, meaning that $\int_{C_1} F \cdot dr = \int_{C_2} F \cdot dr$.


Combining the results above we can see that if $C_1$ is any smooth path from a to b and
$C_2$ is any smooth path with the same trace from b to a then $\int_{C_1} F \cdot dr = -\int_{C_2} F \cdot dr$ for
any continuous vector field F, and for any $C^1$ function f it follows that $\int_{C_1} f\,ds = \int_{C_2} f\,ds$.


Flux and Circulation: 


Definition 108 


Let C be a smooth closed curve parametrized by $r : [a,b] \to \mathbb{R}^n$ and F be a
continuous vector field. Then $\int_C F \cdot dr$ is the circulation of F along C. If C is in
the plane and $r'(t) = \langle x'(t), y'(t) \rangle$ and $F = \langle P, Q \rangle$, and
$n(t) = \frac{\langle y'(t), -x'(t) \rangle}{\sqrt{(x'(t))^2 + (y'(t))^2}}$,
then the flux of F through C in direction n(t) is
$\int_C F(r(t)) \cdot n(t)\,ds = \int_a^b P(r(t))y'(t) - Q(r(t))x'(t)\,dt$.

Alternately, if $n(t) = \frac{\langle -y'(t), x'(t) \rangle}{\sqrt{(x'(t))^2 + (y'(t))^2}}$ then the flux of F through C in direction n(t)
is $\int_C F(r(t)) \cdot n(t)\,ds = \int_a^b -P(r(t))y'(t) + Q(r(t))x'(t)\,dt$.


If the vector field F represents velocity of a fluid then the circulation represents the
value that the amount of fluid flowing around a curve per unit of time approaches as the
velocity of the fluid approaches the velocity it has along the curve. The circulation is not actually the quantity
of water flowing around the curve per unit time, however, because F will vary away from 
the curve. The units are distance times distance per unit time (velocity) however, so it is 
measured in square distance units per time unit. If, instead of a curve, we had a tunnel 
around the curve of a diameter to admit one square unit of the fluid and there was no 
compression in the fluid due to the curvature (imagine the tunnel is in the fourth dimension) 
then the circulation could indicate the actual amount of fluid per unit time that is moving 
along the closed curve if the velocity of all water in a perpendicular cross section to the curve 
were the listed vector field values. Instead, it represents the rate a volume of fluid passing 
around the curve for fluid in a tube of small radius divided by the area of a perpendicular 
cross section of the tube approaches as the radius of the tube approaches zero. It is an 
idealized notion of rate of fluid flow which is only locally accurate. 

Flux through a curve is only defined in $\mathbb{R}^2$ (flux through a surface will be defined in $\mathbb{R}^3$
later). We think of a curve as having two normal directions. If F represents the velocity of
a fluid then the flux through the curve represents the net amount of fluid passing through 
the curve per unit time in the direction of the indicated normal vector n (the fluid passing 
from the side of the curve that n is pointing away from to the side of the curve that n is 
pointing towards). Unlike circulation, flux measures the area of fluid passing through the 
curve per unit time (it is not a limit of such rates for fluid very near the curve). 


Example 13.13. Let C be the circle $x^2 + y^2 = 1$ traversed once counterclockwise starting
at the point (1,0). Let $F = \langle 2y, y + 1 \rangle$ be the velocity of the fluid. Assuming variable
distances are in meters and velocity is in meters per second, find:

(a) The circulation around C. 

(b) The flux of F passing outwards through C. 


Solution. Let C be parametrized by $r(t) = \langle \cos(t), \sin(t) \rangle$, $0 \le t \le 2\pi$.

(a) We check that F is not conservative. Antidifferentiating both coordinates, if $\nabla f = F$
we would need $f = 2yx + g_1(y)$ and $f = \frac{1}{2}y^2 + y + g_2(x)$. There is no $g_2(x)$ we could add
to produce the 2xy term, so F is not conservative. We will just go directly to the definition to
find $\int_C F \cdot dr = \int_0^{2\pi} \langle 2\sin(t), \sin(t) + 1 \rangle \cdot \langle -\sin(t), \cos(t) \rangle\,dt = \int_0^{2\pi} -2\sin^2(t) +
\sin(t)\cos(t) + \cos(t)\,dt = -2\pi$ square meters per second is the circulation using Wallis's
formula.

(b) Using the parametrization given, $r'(t) = \langle -\sin(t), \cos(t) \rangle$ and the unit normal
direction that points out through the circle is $\langle \cos(t), \sin(t) \rangle$. So, the flux is
$\int_0^{2\pi} \langle 2\sin(t), \sin(t) + 1 \rangle \cdot \langle \cos(t), \sin(t) \rangle\,dt = \int_0^{2\pi} 2\sin(t)\cos(t) + \sin^2(t) + \sin(t)\,dt = \pi$
square meters of fluid per second.
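Circulation and outward flux for this example can both be approximated from the parametrization. The sketch below is illustrative and uses the integrands $F \cdot r'$ for circulation and $P\,y' - Q\,x'$ for flux in the direction $\langle y', -x' \rangle / |r'|$, which is the outward normal for the counterclockwise unit circle; the helper names are assumptions.

```python
import math

def simpson(g, lo, hi, n=2000):
    """Composite Simpson's rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3

def F(x, y):
    """Velocity field <2y, y + 1>."""
    return (2 * y, y + 1)

def r(t):
    """Unit circle traversed counterclockwise."""
    return (math.cos(t), math.sin(t))

def dr(t):
    """Derivative r'(t) = <x'(t), y'(t)>."""
    return (-math.sin(t), math.cos(t))

def circulation_integrand(t):
    P, Q = F(*r(t))
    dx, dy = dr(t)
    return P * dx + Q * dy      # F . r'

def flux_integrand(t):
    P, Q = F(*r(t))
    dx, dy = dr(t)
    return P * dy - Q * dx      # F . <y', -x'>, i.e. F . n |r'|

circulation = simpson(circulation_integrand, 0.0, 2 * math.pi)
flux = simpson(flux_integrand, 0.0, 2 * math.pi)
print(circulation, flux)        # compare with -2*pi and pi
```

Swapping the sign of `flux_integrand` gives the flux in the inward direction instead.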


The interpretation of a negative value of circulation is that the fluid is moving against 
the direction of the orientation of the curve. A negative value for flux would indicate that 
the net fluid flow would be through the curve in the opposite direction to the orientation 
listed. 


Green’s Theorem 


It should be noted that all simple closed curves separate the plane into two open subsets 
(the complement of the curve is two disjoint open subsets), one of which is bounded and 
both of which have boundary equal to the simple closed curve. This result is called the 
Jordan Curve Theorem, and its proof is messy, so we will prove Green's Theorem only
for some special cases for which the Jordan Curve Theorem is not needed. We only refer
to the bounded open set in this separation as the region bounded by the curve. Recall that 
we have already defined orientation of an arc. We can also assign an orientation to a simple 
smooth curve (even if it has no end points) which is just a choice of tangent vector (either 
T(t) or —T(t)) over the curve (which must be a continuous function of t for a smooth 
curve). Such choices determine the end and start of a curve if that curve is also a path. We 
address orientation for smooth closed curves in a similar manner. 


Definition 109 


If C is a smooth curve then an orientation of C is a $C^1$ function $O : C \to \mathbb{R}^n$
assigning to each point p of C a tangent vector to C at p of unit length.
If $r : I \to \mathbb{R}^n$ is a smooth curve with trace C then the orientation induced by r
is the function $O(r(t)) = T(t)$ for all $t \in I$. If C is the piecewise-smooth closed
curve $r : [a,b] \to \mathbb{R}^2$ whose trace is the boundary of the bounded open set E, then
we say that C is positively oriented or oriented counterclockwise if for every point
$r(x) \in C$ at which C is differentiable, all points sufficiently close to and to the left
of r(x) are contained in E. More precisely, if $T(x) = \langle a, b \rangle$ then there is some
$\epsilon > 0$ so that if $0 < t < \epsilon$ then $r(x) + t\langle -b, a \rangle \in E$. Likewise, C is negatively
oriented or oriented clockwise if for every point $r(x) \in C$ at which C is differentiable,
all points sufficiently close to and to the right of r(x) are contained in E. More
precisely, if $T(x) = \langle a, b \rangle$ then there is some $\epsilon > 0$ so that if $0 < t < \epsilon$ then
$r(x) + t\langle b, -a \rangle \in E$. If $w : [a,b] \to \mathbb{R}^n$ is a smooth curve W, and C is a smooth
curve which is a subset of W then the orientation of C induced by W is the orientation
of C with parametrization $w : [c,d] \to \mathbb{R}^n$, where $[c,d] = w^{-1}(C)$.


For a smooth path we can also define an orientation to be a choice of starting and ending
point of the path determined by the parametrization. In other words, if $r : [a,b] \to \mathbb{R}^n$ is a
smooth path then its orientation is from $r(a)$ to $r(b)$. The corresponding induced orientation
is $T(t)$.

Once a choice of unit tangent vector for a smooth curve is made at one point on the trace
of a smooth curve $C$, that choice determines the choice of unit tangent vector at all other
points for a given orientation. The reason is that if we let $O$ be the orientation of $C$ and
$r : I \to \mathbb{R}^n$ be any parametrization for $C$, then either $O(r(t)) = T(t)$ or $O(r(t)) = -T(t)$
for every $t \in I$. Hence, if we define $W(t) = O(r(t)) \cdot T(t)$ then $W$ is a dot product of two $C^1$
functions and is $C^1$ and therefore continuous. Since $W$ has a range contained in $\{-1, 1\}$, it
is only possible for $W$ to be continuous if $W$ is constant (because the continuous image of a
connected set is connected), so either $W(t) = 1$ for all $t \in I$, meaning that $O(r(t)) = T(t)$
for all $t \in I$, or $W(t) = -1$ for all $t \in I$, meaning that $O(r(t)) = -T(t)$ for all $t \in I$. Hence,
the choice of induced orientation is determined by the unit tangent vector assigned by that
orientation at any given point, there are only two possible orientations for a given smooth
curve, and the choice of starting and ending points for a smooth path corresponds to the
listed orientation of $C$.


We first prove Green’s Theorem for regions of type I and II. This can be used to prove 
Green’s Theorem for any finite union of regions of type I and II which only intersect along 
their boundary curves, and this is sufficient to allow us to use Green’s Theorem for all of 
the regions we are interested in addressing in this text. 


Theorem 13.14. Special case of Green’s Theorem. Let $C$ be a positively oriented closed
path bounding a region $E$ which can be expressed as both the region between $y = g_1(x)$ and
$y = g_2(x)$, where $g_1(x) \le g_2(x)$ over $a \le x \le b$, and as the region between $x = h_1(y)$ and
$x = h_2(y)$ where $c \le y \le d$. Let $F = \langle P, Q \rangle$ be a $C^1$ vector field over a connected open
set containing $E$. Then

$$\iint_E \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \, dA = \int_C F \cdot dr.$$


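Proof. First, we express $C$ as a sequence of four paths $C_1 : r_1(t) = \langle t, g_1(t) \rangle$, $a \le t \le b$,
followed by $C_2 : r_2(t) = \langle b, t \rangle$, $g_1(b) \le t \le g_2(b)$, followed by $C_3 : r_3(t) = \langle -t, g_2(-t) \rangle$,
$-b \le t \le -a$, followed by $C_4 : r_4(t) = \langle a, -t \rangle$, $-g_2(a) \le t \le -g_1(a)$.

We will show that $\iint_E \frac{\partial P}{\partial y} \, dA = \int_C -P \, dx$ and $\iint_E \frac{\partial Q}{\partial x} \, dA = \int_C Q \, dy$. First, note that

$$\iint_E \frac{\partial P}{\partial y} \, dA = \int_a^b \int_{g_1(x)}^{g_2(x)} \frac{\partial P}{\partial y} \, dy \, dx = \int_a^b P(x, g_2(x)) - P(x, g_1(x)) \, dx.$$

Next,

$$\int_C P(x,y) \, dx = \int_{C_1} P(x,y) \, dx + \int_{C_2} P(x,y) \, dx + \int_{C_3} P(x,y) \, dx + \int_{C_4} P(x,y) \, dx.$$

We can see that $x'(t) = 0$ along $C_2$ and $C_4$ which means that $\int_{C_2} P(x,y) \, dx = 0 = \int_{C_4} P(x,y) \, dx$.
Also, $\int_{C_1} P(x,y) \, dx = \int_a^b P(t, g_1(t)) x'(t) \, dt = \int_a^b P(t, g_1(t)) \, dt$ and
$\int_{C_3} P(x,y) \, dx = \int_{-b}^{-a} P(-t, g_2(-t))(-1) \, dt$. Setting $u = -t$ this integral becomes
$\int_b^a P(u, g_2(u)) \, du = -\int_a^b P(t, g_2(t)) \, dt$.

Thus, $\int_C -P(x,y) \, dx = \int_a^b P(t, g_2(t)) - P(t, g_1(t)) \, dt$ as desired.

The process for $Q$ is similar. As before,

$$\iint_E \frac{\partial Q}{\partial x} \, dA = \int_c^d \int_{h_1(y)}^{h_2(y)} \frac{\partial Q}{\partial x} \, dx \, dy = \int_c^d Q(h_2(y), y) - Q(h_1(y), y) \, dy.$$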




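This time we divide $C$ into paths $C_1 : r_1(t) = \langle h_1(-t), -t \rangle$, $-d \le t \le -c$ followed
by $C_2 : r_2(t) = \langle t, c \rangle$, $h_1(c) \le t \le h_2(c)$ followed by $C_3 : r_3(t) = \langle h_2(t), t \rangle$,
$c \le t \le d$ and finally $C_4 : r_4(t) = \langle -t, d \rangle$, $-h_2(d) \le t \le -h_1(d)$. As before,

$$\int_C Q(x,y) \, dy = \int_{C_1} Q(x,y) \, dy + \int_{C_2} Q(x,y) \, dy + \int_{C_3} Q(x,y) \, dy + \int_{C_4} Q(x,y) \, dy.$$

Since $y'(t) = 0$ on paths $C_2$ and $C_4$ we know that $\int_{C_2} Q(x,y) \, dy = 0 = \int_{C_4} Q(x,y) \, dy$. Next,
$\int_{C_3} Q(x,y) \, dy = \int_c^d Q(h_2(t), t)(1) \, dt$ and
$\int_{C_1} Q(x,y) \, dy = \int_{-d}^{-c} Q(h_1(-t), -t)(-1) \, dt = \int_c^d -Q(h_1(t), t) \, dt$.

Thus, $\iint_E \frac{\partial Q}{\partial x} \, dA = \int_C Q \, dy$ and the proof is complete.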


Example 13.14. Find the path integral $\int_C F \cdot dr$ if $C$ is the unit circle traversed once clockwise
and $F = \langle x + 3y + ye^x, e^x - \cos(y) + x^2 \rangle$.


Solution. Since the orientation of the curve is clockwise, we negate the integral we get
from Green’s Theorem. If $D$ represents the unit disk then this gives us

$$-\iint_D \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \, dA = \int_C F \cdot dr \quad \text{which is} \quad \iint_D (3 + e^x) - (e^x + 2x) \, dA = \iint_D 3 - 2x \, dA.$$

We see that $\iint_D 3 \, dA = 3\pi$ (since integrating a constant over an area gives
the constant times the area) and $\iint_D -2x \, dA = 0$ (since the unit disk is symmetric about the
$y$-axis and the integrand is an odd function of $x$). Alternately, converting this last integral to polar
coordinates it becomes $\int_0^{2\pi} \int_0^1 2r^2 \cos(\theta) \, dr \, d\theta = \int_0^{2\pi} \cos(\theta) \, d\theta \int_0^1 2r^2 \, dr = 0$ since $\int_0^{2\pi} \cos(\theta) \, d\theta = 0$. Thus,
the desired path integral is $3\pi$.
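
As a sanity check on Example 13.14, the clockwise line integral can also be approximated directly, without Green’s Theorem. The sketch below is illustrative only: the function name `clockwise_circulation`, the midpoint rule, and the number of sample points are choices made here, not part of the text. It uses the clockwise parametrization $r(t) = \langle \cos(t), -\sin(t) \rangle$ on $0 \le t \le 2\pi$.

```python
import math

def clockwise_circulation(n=2000):
    """Midpoint-rule approximation of the line integral of F . dr
    around the unit circle traversed once clockwise, where
    F = <x + 3y + y e^x, e^x - cos(y) + x^2>."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y = math.cos(t), -math.sin(t)      # clockwise parametrization
        dx, dy = -math.sin(t), -math.cos(t)   # components of r'(t)
        P = x + 3 * y + y * math.exp(x)
        Q = math.exp(x) - math.cos(y) + x ** 2
        total += (P * dx + Q * dy) * h
    return total

# Compare the direct approximation with the value 3*pi from Green's Theorem.
print(clockwise_circulation(), 3 * math.pi)
```

The two printed numbers should agree to many decimal places, since the integrand is smooth and periodic.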


Definition 110 


We will refer to a set $E \subset \mathbb{R}^2$ as being piecewise type one and two, if it is a union of
finitely many regions $R_1, \ldots, R_n$ of both type one and type two so that for each $i > 1$
it is true that $R_i \cap \bigcup_{j=1}^{i-1} R_j \ne \emptyset$, and is a piecewise smooth simple curve $r : [a,b] \to \mathbb{R}^2$
which is contained in $\partial(R_i) \cap \partial\left(\bigcup_{j=1}^{i-1} R_j\right)$, so that $r((a,b)) \subset \left(R_i \cup \bigcup_{j=1}^{i-1} R_j\right)^\circ$.


Note that these are not standard terms. We use them in this text because they
encapsulate all the regions we are planning on proving Green’s Theorem for. This is a broad
enough class of regions that it will include nearly any curve you are likely to wish to use
Green’s Theorem for without specifically going out of your way to find a counterexample.
Essentially, a piecewise type one and two region is one where you can keep adding extra
type one and two regions on a boundary edge until the union is the entire region.

It is also possible to prove that $r((a,b))$ would be interior to $R_i \cup \bigcup_{j=1}^{i-1} R_j$ (we didn’t have to
make it part of the definition). Thus, parts of the definition are redundant, but because we
do not wish to prove these things we have stated a definition with more strict conditions.

It may be worth noting that some statements of Green’s Theorem found in calculus
texts are excessively optimistic. By reason of the Jordan curve theorem, any simple closed
curve is the boundary of a bounded open set $E$ which can then be shown to be connected
and simply connected, but we do not wish to prove that theorem in this text because it
would be a tangent that would take us too far afield. Some statements of Green’s
Theorem state that every smooth closed curve bounds a region $E$ so that the conclusion of
Green’s Theorem holds. However, it is known that we can have a smooth closed curve which
is the boundary of a bounded region $E$ in $\mathbb{R}^2$ so that $E$ is not even a Jordan region (so we
cannot integrate over $E$). This was determined by W. F. Osgood in 1903 (Transactions of
the American Mathematical Society, volume 4). Our version of Green’s Theorem is not as
general as it could be, but the form given below is sufficient for our purposes.


Theorem 13.15. Let $E$ be a piecewise type one and two region in the $xy$-plane. Then the
boundary of $E$ is a piecewise smooth closed curve $C$. Let $F = \langle P, Q \rangle$ be a $C^1$ vector
field over a connected open set containing $E$. Then $\iint_E \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \, dA = \int_C F \cdot dr$ if $C$ is
positively oriented.


Proof. We will proceed inductively, appending regions one at a time. 

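First, assume that $R_1$ and $R_2$ are closed regions whose boundaries are the traces of
positively oriented piecewise smooth simple closed curves $C_1$ defined by $r_1 : [a_1, b_1] \to \mathbb{R}^2$
and $C_2$ defined by $r_2 : [a_2, b_2] \to \mathbb{R}^2$ respectively, having the property that for any $C^1$
vector field $F = \langle P, Q \rangle$ it is true that

$$\int_{C_1} F \cdot dr = \iint_{R_1} \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \, dA \quad \text{and} \quad \int_{C_2} F \cdot dr = \iint_{R_2} \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \, dA.$$

Assume further that $R_1 \cap R_2 = C_3$, the trace of a piecewise smooth simple
curve $C_3$ defined by $r_3 : [a_3, b_3] \to \mathbb{R}^2$, and that $r_3((a_3, b_3))$ is in the
interior of $R_1 \cup R_2$. Let $E = R_1 \cup R_2$. We wish to show that $\iint_E \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \, dA = \int_C F \cdot dr$,
where $C$ is the positively oriented boundary of $E$.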

We know $R_1^\circ \cap R_2^\circ = \emptyset$. We can assume that $r_3$ is oriented so that the left direction is in
the direction of the interior of $R_1$ (re-labeling which curve is $C_1$ if necessary). Hence, $C_3$
is a portion of the positively oriented piecewise smooth boundary $C_1$ and is oriented in the
same direction (meaning the unit tangent directions are the same as in $r_1$ at each point on
the trace). We refer to $r_3^- : [a_3, b_3] \to \mathbb{R}^2$ as being defined by $r_3^-(t) = r_3(a_3 + b_3 - t)$, which
has the same trace and opposite orientation and is also a simple piecewise smooth curve.
Since the interior of $R_1$ is left of the unit tangent direction along the curve $r_3$, it follows
that the interior of $R_2$ is right of that direction, or left of the tangent direction of $r_3^-$. Hence
$r_3^-([a_3, b_3]) \subset C_2$ and has orientation which is the same as that of $C_2$ on $C_3$ (meaning the
unit tangent directions are the same as in $r_2$ at each point on the trace).

Next, we note that if $x \in \partial(R_1 \cup R_2)$ then either $x \in \partial(R_1)$ or $x \in \partial(R_2)$ which means
that $x \in C_1 \setminus r_3((a_3, b_3))$ or $x \in C_2 \setminus r_3((a_3, b_3))$ since no point of $r_3((a_3, b_3))$ is in the boundary
of $R_1 \cup R_2$ (all such points are in the interior). Likewise, if $x \in C_1 \cup C_2 \setminus r_3((a_3, b_3))$ then
for every $\epsilon > 0$ the ball $B_\epsilon(x)$ contains a point of either $R_1$ or $R_2$. If $x \notin C_2$ then we can
choose $\epsilon$ small enough so that $B_\epsilon(x) \cap R_2 = \emptyset$ since $R_2$ is closed, and we know that $B_\epsilon(x)$
contains a point $p$ which is not in $R_1$, and since $p \notin R_2$ either we see that $x \in \partial(R_1 \cup R_2)$.
Likewise, if $x \in C_2 \setminus C_1$ then $x \in \partial(R_1 \cup R_2)$.

By re-parametrizing if necessary, we can choose $r_1$ so that the images of the end points
$a_1$ and $b_1$ under $r_1$ are not contained in $r_3([a_3, b_3])$. Choose $q$ so that $r_1(q) = r_3(a_3)$. Then
for any $\epsilon > 0$ we can find a $\delta > 0$ so that if $x \in (q - \delta, q + \delta)$ then $|r_1(q) - r_1(x)| < \epsilon$. Note
that $q$ is the first point of $[a_1, b_1]$ whose image is contained in the trace of $r_3$. Hence, there
are points in $C_1 \setminus r_3((a_3, b_3))$ within any positive distance of $r_3(a_3)$ since $r_3((a_3, b_3))$ does
not intersect $r_1((q - \delta, q))$. Likewise, there are points of $C_1 \setminus r_3((a_3, b_3))$ that can be chosen
within any positive distance of $r_3(b_3)$. This means that $r_3(a_3)$ and $r_3(b_3)$ are limit points
of $C_1 \setminus r_3((a_3, b_3))$ and therefore of $C_1 \cup C_2 \setminus r_3((a_3, b_3))$. The boundary of a set is always
closed, so $r_3(a_3)$ and $r_3(b_3)$ are elements of $\partial(R_1 \cup R_2)$. Hence, the boundary of $R_1 \cup R_2$ is
exactly $C_1 \cup C_2 \setminus r_3((a_3, b_3))$.

Next, we will show that $C_1 \cup C_2 \setminus r_3((a_3, b_3))$ is a piecewise smooth closed curve. There
is a piece decomposition of $C_1$ consisting of component curves $s_1 : [a_1 = x_0, x_1] \to \mathbb{R}^2$, $s_2 :
[x_1, x_2] \to \mathbb{R}^2, \ldots, s_m : [x_{m-1}, x_m = b_1] \to \mathbb{R}^2$. There is some $j$ so that $q \in [x_{j-1}, x_j]$. If
we choose $w$ so that $r_1(w) = r_3(b_3)$ then there is also some $k$ so that $w \in [x_{k-1}, x_k]$.
We let $P_1$ be the piecewise smooth simple curve whose piece decomposition consists of
$s_1 : [x_0, x_1], s_2 : [x_1, x_2], \ldots, s_j : [x_{j-1}, q]$. We let $P_4$ be the piecewise smooth curve whose
piece decomposition consists of the component curves $s_k : [w, x_k], s_{k+1} : [x_k, x_{k+1}], \ldots, s_m :
[x_{m-1}, x_m]$. We note that the ending point of $P_4$ is the same as the starting point of $P_1$,
and $P_1 \cup P_4 = C_1 \setminus r_3((a_3, b_3))$. Similarly, we can find piecewise smooth curves $P_2$ and $P_3$ so
that the starting point of $P_2$ is $r_3(a_3)$ which is the terminal point of $P_1$, the ending point of
$P_2$ is the starting point of $P_3$ and the terminal point of $P_3$ is the starting point of $P_4$, which
is $r_3(b_3)$. We note that $P_1$ intersects $P_2$ at only its end point and $P_4$ at only its initial point,
and likewise $P_2$ intersects $P_1$ at only its initial point and $P_3$ at only its end point, and $P_4$
intersects $P_3$ at only its starting point and $P_1$ at only its ending point. Thus, $P_1$ followed
by $P_2$ followed by $P_3$ followed by $P_4$ is a positively oriented piecewise smooth closed curve,
$C_4$.
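Finally, writing $C_3^-$ for the curve defined by $r_3^-$, we note that

$$\iint_E \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \, dA = \iint_{R_1} \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \, dA + \iint_{R_2} \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \, dA = \int_{C_1} F \cdot dr + \int_{C_2} F \cdot dr$$

$$= \int_{P_1} F \cdot dr + \int_{P_2} F \cdot dr + \int_{P_3} F \cdot dr + \int_{P_4} F \cdot dr + \int_{C_3} F \cdot dr + \int_{C_3^-} F \cdot dr = \int_{C_4} F \cdot dr,$$

where the last two integrals cancel because $C_3$ and $C_3^-$ have the same trace and opposite orientations.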
The remainder of the theorem follows inductively. If $E$ is a piecewise type one and two
region that is a union of finitely many regions $E_1, \ldots, E_n$ of both type one and type two
so that for each $i > 1$ it is true that $E_i \cap \bigcup_{j=1}^{i-1} E_j \ne \emptyset$, and is a piecewise smooth simple
curve $r : [a,b] \to \mathbb{R}^2$ which is contained in $\partial(E_i) \cap \partial\left(\bigcup_{j=1}^{i-1} E_j\right)$, so that $r((a,b)) \subset \left(E_i \cup \bigcup_{j=1}^{i-1} E_j\right)^\circ$,
then we just let $R_1$ be $E_1$ and $R_2$ be $E_2$ to give us that the boundary of $E_1 \cup E_2$ is a
positively oriented piecewise smooth closed curve $C$, and $\iint_{E_1 \cup E_2} \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \, dA = \int_C F \cdot dr$.
We then let $R_1 = E_1 \cup E_2$ in the argument above and $R_2 = E_3$ to conclude that the
boundary of $E_1 \cup E_2 \cup E_3$ is a positively oriented piecewise smooth closed curve $C$, and
$\iint_{E_1 \cup E_2 \cup E_3} \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \, dA = \int_C F \cdot dr$. We continue appending additional $E_i$ sets until we
have added them all, at which point we conclude that the boundary of $E$ is a positively
oriented piecewise smooth closed curve $C$, and $\iint_E \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \, dA = \int_C F \cdot dr$.
Cc 


Finding the area enclosed by a smooth closed curve: 


One of the uses of Green’s Theorem is to find an area $A(R)$ of a piecewise type one
and two region $R$ whose boundary is a piecewise smooth positively oriented boundary curve
(the curves for which we have proven Green’s Theorem). If $P(x,y) = -y$ and $Q(x,y) = 0$
and $F(x,y) = \langle P, Q \rangle$ then $\int_C F \cdot dr = \iint_R \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \, dA = \iint_R 1 \, dA = A(R)$. Likewise,
if $P(x,y) = 0$ and $Q(x,y) = x$, or $P(x,y) = -\frac{y}{2}$ and $Q(x,y) = \frac{x}{2}$, then $\int_C F \cdot dr = A(R)$.
G 


Example 13.15. Use Green’s Theorem to show that the region $R$ bounded by the ellipse
$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ has area $\pi ab$.

Proof. We can parametrize the ellipse as the curve $C$ defined by $r(t) = \langle a\cos(t), b\sin(t) \rangle$
over $0 \le t \le 2\pi$. We will set $F = \langle 0, x \rangle$. Then $Q_x - P_y = 1$, so by Green’s
Theorem we conclude

$$A(R) = \iint_R 1 \, dA = \int_0^{2\pi} \langle 0, a\cos(t) \rangle \cdot \langle -a\sin(t), b\cos(t) \rangle \, dt = \int_0^{2\pi} ab\cos^2(t) \, dt = \pi ab$$

by Wallis’s formula.
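
The area formula $A(R) = \int_C x \, dy$ used in Example 13.15 is easy to test numerically. The following is a minimal sketch, assuming a midpoint rule and a hypothetical helper name `ellipse_area`; the semi-axes 3 and 2 in the usage line are arbitrary sample values, not taken from the text.

```python
import math

def ellipse_area(a, b, n=1000):
    """Approximate the line integral of x dy around the ellipse
    r(t) = <a cos(t), b sin(t)>, 0 <= t <= 2*pi (counterclockwise),
    which by Green's Theorem with F = <0, x> equals the enclosed area."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x = a * math.cos(t)        # x-coordinate on the curve
        dy_dt = b * math.cos(t)    # y'(t)
        total += x * dy_dt * h
    return total

# Sample semi-axes; the result should be close to pi * 3 * 2.
print(ellipse_area(3.0, 2.0), math.pi * 3.0 * 2.0)
```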
0 


Flux through a closed curve: 


Green’s Theorem provides a shortcut for flux integrals if the flux is to be found through 
a smooth closed curve bounding a convenient region. 


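Theorem 13.16. Let $C$ be the positively oriented smooth closed curve $r : [a,b] \to \mathbb{R}^2$
bounding the piecewise type one and two region $E$, and let $F = \langle P, Q \rangle$ be a $C^1$ vector field
on $\mathbb{R}^2$. Let $n = \dfrac{\langle y'(t), -x'(t) \rangle}{\sqrt{x'(t)^2 + y'(t)^2}}$. Then the flux integral of $F$ through $C$ in direction $n$
is $\iint_E P_x + Q_y \, dA$.


Proof. We note that the flux integral is

$$\int_a^b \langle P, Q \rangle \cdot \langle y'(t), -x'(t) \rangle \, dt = \int_a^b P y'(t) - Q x'(t) \, dt = \int_C P \, dy - Q \, dx = \iint_E P_x + Q_y \, dA$$

by Green’s Theorem (applied to the vector field $\langle -Q, P \rangle$).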




Example 13.16. Let $C$ be the square with sides in the lines $x = \pm 2$ and $y = \pm 2$, bounding
region $R$. Let $F = \langle 2y, x^2 + 3y \rangle$. Find the flux through $C$ outwards through the square
(in the normal direction pointing away from the origin).


Solution. If $C$ is oriented counterclockwise then the normal direction pointing outwards is
the one in which we wish to take the flux integral, which means that
$n = \dfrac{\langle y'(t), -x'(t) \rangle}{\sqrt{(x'(t))^2 + (y'(t))^2}}$
is the corresponding unit normal direction at every point where the path is differentiable.
Since the square has side length 4 and therefore area 16, the flux is
$\iint_R P_x + Q_y \, dA = \iint_R 0 + 3 \, dA = 3(16) = 48$.
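
The flux in Example 13.16 can be cross-checked by integrating $F \cdot n$ directly over the four sides of the square with corners $(\pm 2, \pm 2)$, which has area 16, so that $\iint_R 3 \, dA = 48$. This is a rough sketch only: the helper name `outward_flux_square` and the midpoint rule are assumptions made for illustration.

```python
def F(x, y):
    """The vector field F = <2y, x^2 + 3y> from the example."""
    return (2 * y, x ** 2 + 3 * y)

def outward_flux_square(half_side=2.0, n=1000):
    """Approximate the outward flux of F through the square with sides
    on x = +/- half_side and y = +/- half_side by summing F . n over
    each side, where n is the outward unit normal of that side."""
    h = 2 * half_side / n
    total = 0.0
    for k in range(n):
        s = -half_side + (k + 0.5) * h
        total += F(half_side, s)[0] * h     # right side,  n = <1, 0>
        total -= F(-half_side, s)[0] * h    # left side,   n = <-1, 0>
        total += F(s, half_side)[1] * h     # top side,    n = <0, 1>
        total -= F(s, -half_side)[1] * h    # bottom side, n = <0, -1>
    return total

# Should be close to 3 * (area of the square) = 3 * 16.
print(outward_flux_square())
```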


Divergence and Curl 


Divergence and curl are useful features of a vector field which have physical applications
as well as being useful in determining whether a three dimensional vector field is conservative.
Often, we use notation shortcuts as mnemonics to help us remember divergence and curl.
A vector field of any dimension has a divergence which is often written as $\nabla \cdot F$. The idea
behind this notation is to treat $\nabla$ as though it were a vector whose entries are differentiation
operators with respect to the variables, and $F$ as a vector whose entries are the coordinates
of the vector field. Multiplication is then replaced by performing the differentiation operator
on the corresponding coordinate of the vector field. Thus, in three dimensions we would
have $\nabla = \left\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right\rangle$ and $F = \langle P, Q, R \rangle$ so that the divergence of the vector field is:


Definition 111 


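Let $F$ be a differentiable vector field $\langle P_1, P_2, \ldots, P_n \rangle$ in the variables $x_1, x_2, \ldots, x_n$.
The divergence $\operatorname{div}(F)$ is defined by:

$$\operatorname{div}(F) = \nabla \cdot F = \frac{\partial P_1}{\partial x_1} + \frac{\partial P_2}{\partial x_2} + \cdots + \frac{\partial P_n}{\partial x_n}$$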


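Of course, this is not actually a dot product, but the notation helps you remember the
formula. So, for instance, if $F = \langle P, Q \rangle$ is a two dimensional vector
field then we have $\operatorname{div}(F) = \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y}$, and if $F = \langle P, Q, R \rangle$ is a three dimensional vector
field then $\operatorname{div}(F) = \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} + \frac{\partial R}{\partial z}$.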


A similar mnemonic helps us to remember the curl. In this case, we take a cross product: 




Definition 112 
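Let $F = \langle P, Q, R \rangle$ be a differentiable vector field. The curl $\operatorname{curl}(F)$ is defined by:

$$\operatorname{curl}(F) = \nabla \times F = \begin{vmatrix} i & j & k \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ P & Q & R \end{vmatrix} = \langle R_y - Q_z, P_z - R_x, Q_x - P_y \rangle$$

For a two dimensional vector field $F = \langle P, Q \rangle$ we define $\operatorname{curl}(F) = Q_x - P_y$.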


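Example 13.17. Let $F = \langle x^2, 2xy, ye^z \rangle$. Find the curl and divergence of $F$.


Solution. First, $\operatorname{div}(F) = \nabla \cdot F = 2x + 2x + ye^z = 4x + ye^z$. Next,

$$\operatorname{curl}(F) = \nabla \times F = \begin{vmatrix} i & j & k \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ x^2 & 2xy & ye^z \end{vmatrix} = \langle e^z, 0, 2y \rangle.$$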

It can be helpful to think of the absolute value of divergence as representing net 
expansion of a vector field away from the point at which the divergence is taken (or 
compression if the divergence is negative), and magnitude of curl as representing the largest 
circulation per unit area the vector field spins around a given line, where the curl vector 
itself points in the direction of that line (and the spin is counter clockwise around that 
vector as viewed from a point in the direction towards which the curl vector points). 


As an intuitive informal description, if you think of a given partial like $\frac{\partial P}{\partial x}$ as being
positive then that means that the flow in the $x$ direction is getting larger, so if the
value of the vector field in the $x$ direction is positive, the amount leaving is larger than
the amount coming in. If we picture this vector field as representing the velocity of a gas
then we are saying that the speed at which the gas leaves is greater than the speed at which it
enters, so the gas is becoming less dense relative to the $x$ direction. By adding the quantities
from all three directions we can get an idea of whether the gas coming in exceeds the gas
going out. If the divergence is zero then we say that the vector field is incompressible which
suggests to the mind the idea that the density is neither being increased nor reduced as one
moves along the vector field.

The curl is somewhat harder to picture, but it is possible to show that if $C_r$ is the
path around a circle of radius $r$ centered at a point $p \in \mathbb{R}^3$ in the plane perpendicular
to $\operatorname{curl}(F)(p)$ which is counterclockwise as viewed from a point along $\operatorname{curl}(F)(p)$ then
$|\operatorname{curl}(F)(p)| = \lim_{r \to 0} \frac{1}{\pi r^2} \int_{C_r} F \cdot dr$, where $\int_{C_r} F \cdot dr$ is referred to as the circulation along
$C_r$, so that $|\operatorname{curl}(F)(p)|$ is the largest circulation per unit area at $p$. If we were to take
any other direction from the point $p$ and calculate $\lim_{r \to 0} \frac{1}{\pi r^2} \int_{C_r} F \cdot dr$ then we would get a
smaller value than the absolute value of the curl. If the curl of a vector field is the zero
vector then we say that the vector field is irrotational, meaning that the circulation per unit
area (also called circulation density or infinitesimal circulation) is zero.
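
The description of curl as circulation per unit area can be explored numerically. The sketch below is an illustration under stated assumptions, not a proof: it reuses the field $F = \langle x^2, 2xy, ye^z \rangle$ with curl $\langle e^z, 0, 2y \rangle$, and the helper names, the sample point $(1, 2, 0.5)$, and the radius $0.01$ are all choices made here. It estimates $\frac{1}{\pi r^2} \int_{C_r} F \cdot dr$ for a small circle perpendicular to a chosen axis.

```python
import math

def F(x, y, z):
    return (x * x, 2 * x * y, y * math.exp(z))

def curl_F(x, y, z):
    # Curl of F computed by hand: <e^z, 0, 2y>.
    return (math.exp(z), 0.0, 2 * y)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(a):
    length = math.sqrt(sum(c * c for c in a))
    return tuple(c / length for c in a)

def circulation_density(p, axis, radius, n=2000):
    """Circulation of F around the circle of the given radius centered at p,
    in the plane perpendicular to axis and counterclockwise as viewed from
    the tip of axis, divided by the area pi * radius^2."""
    w = normalize(axis)
    helper = (1.0, 0.0, 0.0) if abs(w[0]) < 0.9 else (0.0, 1.0, 0.0)
    e1 = normalize(cross(helper, w))   # unit vector perpendicular to w
    e2 = cross(w, e1)                  # chosen so that e1 x e2 = w
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        c, s = math.cos(t), math.sin(t)
        pos = tuple(p[i] + radius * (c * e1[i] + s * e2[i]) for i in range(3))
        vel = tuple(radius * (-s * e1[i] + c * e2[i]) for i in range(3))
        total += sum(f * v for f, v in zip(F(*pos), vel)) * h
    return total / (math.pi * radius ** 2)

p = (1.0, 2.0, 0.5)
curl_at_p = curl_F(*p)
print(circulation_density(p, curl_at_p, 0.01))          # near |curl(F)(p)|
print(math.sqrt(sum(c * c for c in curl_at_p)))
print(circulation_density(p, (0.0, 0.0, 1.0), 0.01))    # smaller: another axis
```

In this sample the density around the curl direction is close to $|\operatorname{curl}(F)(p)|$, while the density around the $z$-axis is smaller, consistent with the description above.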

Another way to determine whether a vector field F is conservative is by determining 
whether its curl is zero, which we will discuss after Stokes’s Theorem. 


Surfaces 


Definition 113 


Let $r : E \to \mathbb{R}^3$, where $r(u,v) = \langle x(u,v), y(u,v), z(u,v) \rangle$ is a $C^1$ one to one
function whose coordinates’ partial derivatives are bounded on $E$, where $E$ is an
open Jordan region in $\mathbb{R}^2$. Let $r_u = \langle x_u, y_u, z_u \rangle$ and $r_v = \langle x_v, y_v, z_v \rangle$
satisfy the property that $r_u \times r_v \ne 0$ on $E$. Then we will refer to $r$ as a parametrized
surface (or a parametrization for a surface) and $r(E)$ as a surface. We say that the
parametrized surface is regular at the point $(u,v)$ if there is some open set $V \subset \mathbb{R}^3$
containing $r(u,v)$ so that $V \cap r(D) = r(U)$ for some open set $U \subset D$ containing
$(u,v)$, where $r^{-1} : V \cap r(D) \to U$ is continuous (in other words, $r$ restricted to $U$ is
a homeomorphism onto $V \cap r(D)$). If $r$ is regular at every point of $D$ then we say
that $r$ is a regular parametrized surface and that $r(D)$ is a regular surface.

Let $D \subset E$, where $U \subset D \subset \overline{U} \subset E$ and $U$ is a connected open Jordan
region. Then we will refer to $r(D)$ as a standard surface and its parametrization
as a parametrized standard surface or a parametrization for a standard surface. If
$D_1, D_2, \ldots, D_m$ are non-overlapping sets as described, the interior of whose union is
a connected open Jordan region $U$ so that $U \subset \bigcup_{i=1}^m D_i \subset \overline{U}$, then we refer
to $\bigcup_{i=1}^m r_i(D_i)$ as a piecewise smooth standard surface. We refer to each $r_i(D_i)$ as a
standard surface component of the standard piecewise-smooth surface, and each $r_i$
as a standard piecewise-smooth surface component parametrization.


When we refer to a standard surface or just a regular surface $r(D)$ we mean that $r$ is a
standard (or regular) parametrized surface and that $D$ is a domain satisfying the definition
above.


The requirement that $r^{-1} : V \cap r(D) \to U$ be continuous in the definition of regular
at a point is actually redundant (it will follow automatically if the rest of the definition
holds). The usual definition of a regular surface allows for multiple parametrizations to be
used rather than just one, covering a surface $S$ with a set of coordinate maps (allowing
for messier surfaces), but our definition will be sufficient for the things we wish to do with
surfaces.

Another way of saying that a parametrized surface is regular at $(u,v)$ is that $r(D)$ is
locally the graph of a differentiable function near the point $r(u,v)$, though demonstrating
this will require a little bit of work. Thus, the points $(u,v)$ at which $r$ is regular are the
points $(u,v)$ so that $r(D)$ has a tangent plane and normal vector at $r(u,v)$. The notions
of "standard surface" and regular "at a point" are not standard conventions, but these
terms will be helpful in the theorems that follow. It is also worth noting that because of
the Extreme Value Theorem, the requirement that the partial derivatives be bounded can
be omitted in defining a standard parametrization (because it follows automatically on a
compact domain).


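Note that the requirement that the cross product remain non-zero is the same as the
requirement that one of

$$\begin{vmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \end{vmatrix}, \quad \begin{vmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial z}{\partial u} & \frac{\partial z}{\partial v} \end{vmatrix}, \quad \begin{vmatrix} \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \\ \frac{\partial z}{\partial u} & \frac{\partial z}{\partial v} \end{vmatrix}$$

is non-zero.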


Example 13.18. Let $z = f(x,y)$ be a $C^1$ function defined on a compact set $K$ containing
an open Jordan region $D$. Let $G = \{(x,y,z) \in \mathbb{R}^3 \mid (x,y) \in D \text{ and } z = f(x,y)\}$ be the graph
of $f$. Show that the function $r(u,v) = (u, v, f(u,v))$ on $D$ is a parametrized surface so that
the surface $r(D) = G$ is a (standard) regular surface.


Proof. Since $f$ is $C^1$ it follows that the partial derivatives of each component of $r$ are
continuous. Since the derivatives of $f$ are continuous on a compact set $K$ containing $D$
it follows that they are bounded on $K$ (and hence on $D$) by the Extreme Value Theorem.
Since the first two coordinates of $r(u,v)$ are $u$ and $v$, whenever $(x_1, y_1) \ne (x_2, y_2)$ in $D$ it is true
that $r(x_1, y_1) \ne r(x_2, y_2)$. Thus, $r$ is one to one. Finally, $r_u = \langle 1, 0, f_u \rangle$ and $r_v = \langle 0, 1, f_v \rangle$ so
the cross product is $r_u \times r_v = \langle -f_u, -f_v, 1 \rangle \ne 0$ on $D$, which means that $r$ is a parametrized
surface. Finally, to see that $r(D)$ is regular, for any $(u, v, f(u,v)) \in r(D)$ we find $\epsilon > 0$
so that $E = B_\epsilon(u,v) \subset D$, so that $f$ is bounded on $E$ by the Extreme Value Theorem so
we can find an $M$ so that $M > |f(x,y)|$ on $E$. Then the set $V = \{(x,y,z) \in \mathbb{R}^3 \mid (x,y) \in
B_\epsilon(u,v) \text{ and } -M < z < M\}$ is an open subset of $\mathbb{R}^3$ so that $V \cap r(D) = r(B_\epsilon(u,v))$. Thus,
$r(D)$ is a regular surface. Since $r$ is $C^1$ on $K$ it is also true that $r(D)$ is a standard surface.


Note that by the same reasoning if $y = g(x,z)$ (or $x = g(y,z)$ respectively) is a
$C^1$ function on a compact set $K$ containing an open Jordan region $D$ then $r(u,v) =
(u, g(u,v), v)$ (or $r(u,v) = (g(u,v), u, v)$ respectively) is a parametrized surface whose range
is a regular surface.


Let $(u_0, v_0) \in D$, a connected open Jordan region, and let $r(u,v)$ be a parametrized
surface. Since $D$ is open, we can find $\delta > 0$ so that $B_\delta(u_0, v_0) \subset D$, which means that
$(u_0 + t, v_0)$ and $(u_0, v_0 + t)$ are points of $D$ whenever $|t| < \delta$. Thus, we
can define $C_1(t) = r(u_0 + t, v_0)$, $C_2(t) = r(u_0, v_0 + t)$ on $(-\delta, \delta)$ which are curves
contained in the surface $r(D)$. Note that $C_1'(0) = \lim_{t \to 0} \dfrac{r(u_0 + t, v_0) - r(u_0, v_0)}{t} = r_u(u_0, v_0)$,
and likewise $C_2'(0) = r_v(u_0, v_0)$. Since these curves are tangent to the surface $r(D)$, the
vector $r_u \times r_v(u_0, v_0)$ is perpendicular to the surface. This assumes that there is a vector
perpendicular to the surface, of course, which will be true assuming the surface is locally
the graph of a differentiable function, which we will see is true if $r$ is regular at $(u_0, v_0)$.


We see from this and the example above that the unit normal vector to parametrized
surface $r(u,v)$ at $r(u_0, v_0)$ is $\dfrac{r_u \times r_v(u_0, v_0)}{|r_u \times r_v(u_0, v_0)|}$ in general, and that in the specific case of
the graph of a function $z = f(x,y)$ the unit normal vector is $\dfrac{(-f_u, -f_v, 1)}{\sqrt{1 + (f_u)^2 + (f_v)^2}}(u_0, v_0)$,
assuming these parametrized surfaces are regular at $(u_0, v_0)$.
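
The two formulas for the unit normal can be compared numerically for a graph. The sketch below uses a hypothetical sample function $f(u,v) = u^2 + 3v$, which is not taken from the text, and approximates $r_u$ and $r_v$ with central differences. The helper names are likewise assumptions made for illustration.

```python
import math

def f(u, v):
    # Hypothetical sample function chosen for illustration only.
    return u * u + 3.0 * v

def r(u, v):
    """Parametrization of the graph of f: r(u, v) = (u, v, f(u, v))."""
    return (u, v, f(u, v))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit_normal(u0, v0, h=1e-5):
    """Unit normal (r_u x r_v) / |r_u x r_v| with r_u and r_v estimated
    by central differences."""
    r_u = tuple((p - m) / (2 * h) for p, m in zip(r(u0 + h, v0), r(u0 - h, v0)))
    r_v = tuple((p - m) / (2 * h) for p, m in zip(r(u0, v0 + h), r(u0, v0 - h)))
    n = cross(r_u, r_v)
    length = math.sqrt(sum(c * c for c in n))
    return tuple(c / length for c in n)

def graph_normal_formula(u0, v0):
    """The closed form (-f_u, -f_v, 1) / sqrt(1 + f_u^2 + f_v^2) for the sample f."""
    f_u, f_v = 2.0 * u0, 3.0
    s = math.sqrt(1.0 + f_u ** 2 + f_v ** 2)
    return (-f_u / s, -f_v / s, 1.0 / s)

print(unit_normal(1.0, 2.0))
print(graph_normal_formula(1.0, 2.0))
```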


The fact that parametrized surfaces are one to one means that most surfaces one is
likely to think of are standard surfaces. However, a surface could come back in on itself like
$S = \{(x,y,z) \in \mathbb{R}^3 \mid -1 < z < 1 \text{ and either } y = \sin(\frac{1}{x}) \text{ and } 0 < x < 1 \text{ or } x = 0 \text{ and } -1 <
y < 1\}$. We can see that near the origin, there is no open set containing the origin which
does not contain infinitely many layers of sheets of curve converging towards a section of 
the surface containing the origin. This means that there isn’t a tangent plane in the sense 
that we have defined the term, or a normal vector at the origin. Even if we extended the 
definition to something more general we could alter the curve so that it had slopes that did 
not converge towards any particular value as they approached the origin so we do have a 
potential problem defining these ideas at points where the surface is not regular. It is thus 
worth investigating what implies regularity. We have already demonstrated (in the earlier 
example) that there is a parametrization for a graph of a function which is regular. 


We next note the following which lets us determine continuity of an inverse function for 
one to one continuous functions on a compact domain. 


Theorem 13.17. Let $K \subset \mathbb{R}^n$ be compact and let $f : K \to \mathbb{R}^m$ be one to one and continuous.
Then $f^{-1} : f(K) \to K$ is continuous.


Proof. Let $\{f(p_m)\} \to f(p)$ in $f(K)$. Then $\{p_m\}$ is a bounded sequence (since $K$ is
bounded) and has a convergent subsequence $\{p_{m_i}\}$ which converges to a point $q$ by the
Bolzano-Weierstrass Theorem, where $q \in K$ since $K$ is closed. Thus, $\{f(p_{m_i})\} \to f(q)$ since
$f$ is continuous. Since $\{f(p_{m_i})\}$ is a subsequence of $\{f(p_m)\}$ we know that $\{f(p_{m_i})\} \to
f(p)$. Since $f$ is one to one it follows that $p = q$. Let $\epsilon > 0$. Suppose there are infinitely
many integers $m$ so that $p_m \notin B_\epsilon(p)$. Then if we order the corresponding sequence members
into a subsequence $\{p_{m_j}\}$ (where $m_j$ is the $j$th integer so that $p_{m_j} \notin B_\epsilon(p)$) then this
subsequence has a subsequence $\{p_{s(m_j)}\}$ which converges to a point $w \in K \setminus B_\epsilon(p)$ since
this set is closed. But then $\{f(p_{s(m_j)})\} \to f(w)$. This is impossible since $|w - p| \ge \epsilon$ and
we know that $\{f(p_{s(m_j)})\} \to f(p)$, which implies that $p = w$. Thus, since there are only
finitely many integers $m$ so that $p_m \notin B_\epsilon(p)$, we can choose $k \in \mathbb{N}$ so that if $i > k$ then
$|p_i - p| < \epsilon$, so $f^{-1}$ is continuous at $f(p)$.


Using this theorem, it is now possible to see why the condition that $r^{-1}$ be continuous is redundant
in the definition of regular at a point, because if $r$ is continuous on an open $U$ then we
can find $\epsilon > 0$ so that $\overline{B_\epsilon(u,v)} \subset U$, so $r$ is continuous and one to one on the compact set
$\overline{B_\epsilon(u,v)}$, so $r^{-1}$ is continuous on $r(\overline{B_\epsilon(u,v)})$ which means that $r^{-1}$ is also continuous on
$r(B_\epsilon(u,v))$. If $r(U) = r(D) \cap V$ then
by picking $\epsilon_y > 0$ so that $B_{\epsilon_y}(y) \subset V$ for each $y \in r(D)$ we see that $V' = \bigcup_{y \in r(D)} B_{\epsilon_y}(y)$ is
an open set so that $V' \cap r(D) = r(B_\epsilon(u,v))$.




Theorem 13.18. Let $r : D \to \mathbb{R}^3$ be a parametrized surface on the open Jordan region $D$,
where $(u_0, v_0) \in D$. Then $r$ is regular at $(u_0, v_0)$ if and only if $r(D)$ is a surface which is
locally the graph of a $C^1$ function at $r(u_0, v_0)$.


Proof. First, assume that $r(D) = S$ is a surface so that $S$ is locally the graph of a $C^1$
function at $r(u_0, v_0)$. We will assume that the function is of the form $z = f(x,y)$ (the
other cases are similar). Then if $r(u_0, v_0) = (x_0, y_0, z_0)$ there is a $C^1$ function $f(x,y)$ on
some open ball $B_\epsilon(x_0, y_0)$ so that there is some open set $V$ so that $V \cap r(D) = G =
\{(x,y,z) \in \mathbb{R}^3 \mid (x,y) \in B_\epsilon(x_0, y_0) \text{ and } z = f(x,y)\}$, the graph of $f$ on $B_\epsilon(x_0, y_0)$. Hence,
if we use the parametrization $r_1(u,v) = (u, v, f(u,v))$ then this is a parametrization which
is regular at all points $(u,v) \in B_\epsilon(x_0, y_0)$ (as discussed in the example above). If we set
$U = r^{-1}(r_1(B_\epsilon(x_0, y_0)))$ then $U$ is open since for every $z \in r^{-1}(r_1(B_\epsilon(x_0, y_0)))$ there is some
$\gamma > 0$ so that $B_\gamma(r_1^{-1}(r(z))) \subset B_\epsilon(x_0, y_0)$ and there is a $\delta > 0$ so that if $|x - z| < \delta$ then
$|r(x) - r(z)| < \gamma$ which means that $|r_1^{-1}(r(z)) - r_1^{-1}(r(x))| < \gamma$ and therefore whenever
$x \in B_\delta(z)$ it is true that $x \in U$, making $U$ open. It follows that $r$ is regular at $(u_0, v_0)$.
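Next, assume that $r$ is regular at $(u_0, v_0)$, where $r(u,v) = (x(u,v), y(u,v), z(u,v))$.
Then one of

$$\begin{vmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \end{vmatrix}, \quad \begin{vmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial z}{\partial u} & \frac{\partial z}{\partial v} \end{vmatrix}, \quad \begin{vmatrix} \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \\ \frac{\partial z}{\partial u} & \frac{\partial z}{\partial v} \end{vmatrix}$$

is non-zero at $(u_0, v_0)$. We will assume that
$\begin{vmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \end{vmatrix}$
is non-zero at $(u_0, v_0)$ (the other cases are similar). We then find an open set
$U \subset \mathbb{R}^2$ containing $(u_0, v_0)$ and an open set $V$ containing $r(u_0, v_0)$ so that $V \cap r(D) = r(U)$
and $r : U \to V \cap r(D)$ is a homeomorphism.

If we define $\pi(x,y,z) = (x,y)$ to be the projection onto the $xy$ plane, then $\pi \circ r =
(x(u,v), y(u,v)) : U \to \mathbb{R}^2$ and $\det D(\pi \circ r) = \begin{vmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \end{vmatrix}(u_0, v_0) \ne 0$. Hence, by the Inverse
Function Theorem, it is true that there are open sets $U_1 = B_\epsilon(u_0, v_0)$ and $V_1 = \pi \circ
r(B_\epsilon(u_0, v_0))$ so that $\pi \circ r$ is one to one, $C^1$ and has $C^1$ inverse $g$ on $V_1$. We then let
$f(x,y) = z(g(x,y))$, where $f : V_1 \to \mathbb{R}$. Since $g$ and $z$ are $C^1$ we know that $f$ is $C^1$
on $V_1$. The graph of $f$ is $G = \{(x,y,z) \in \mathbb{R}^3 \mid (x,y) \in V_1 \text{ and } z = z(g(x,y))\}$. However,
$z(g(x,y))$ is the unique $z$ value corresponding to the point $(u,v) = g(x,y) \in U_1$ so
that $(x(u,v), y(u,v), z(u,v)) \in r(U_1)$, which means that $(x,y,z) \in G$ if and only if
$(x,y,z) \in r(U_1)$, so the image of $U_1$ under $r$ is the graph of a differentiable
function $f$. If we define $W = V \cap \pi^{-1}(V_1)$ then $r(D) \cap W = G$, so $r(D)$ is locally the graph
of a $C^1$ function near $r(u_0, v_0)$ as desired.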


An almost immediate consequence of the preceding theorem is the following: 


Theorem 13.19. Let $F : U \to \mathbb{R}$, where $U$ is open in $\mathbb{R}^3$ containing $p = (x_0, y_0, z_0)$,
$F$ is $C^1$ and $F(x_0, y_0, z_0) = k$. Let $\nabla F(x) \ne 0$ for every $x \in U$ so that $F(x) = k$. Let
$S = \{(x,y,z) \in U \mid F(x,y,z) = k\} = F^{-1}(k)$ (the graph of the relationship $F(x,y,z) = k$)
and let $r : D \to \mathbb{R}^3$ be a parametrized surface where $r(D) = S$. Then $S$ is a regular surface.




Proof. By Theorem 11.28, $S$ is locally the graph of a $C^1$ function, which means that $S$ is a
regular surface by Theorem 13.18.


Let $r : D \to \mathbb{R}^3$ be a regular parametrized surface $r(u,v) = (x(u,v), y(u,v), z(u,v))$ in $\mathbb{R}^3$. Recall from Theorem 10.4 that the area of the parallelogram with vector sides $r_u, r_v$ is $|r_u \times r_v|$. Suppose we take a rectangle $R = [u_0, u_0 + \Delta u] \times [v_0, v_0 + \Delta v] \subset D$ with vector sides $\langle \Delta u, 0 \rangle$ and $\langle 0, \Delta v \rangle$. If the derivative of the transformation $r$ were a constant matrix $D$ whose columns are $r_u, r_v$ on $R$, we would have that

\[
r(u_0 + h_1, v_0 + h_2) - r(u_0, v_0) = \left( \frac{\partial x}{\partial u} h_1 + \frac{\partial x}{\partial v} h_2,\ \frac{\partial y}{\partial u} h_1 + \frac{\partial y}{\partial v} h_2,\ \frac{\partial z}{\partial u} h_1 + \frac{\partial z}{\partial v} h_2 \right)
\]

(by the Mean Value Theorem for Real Valued Functions on each component function). If we only look at points on the rectangle then $0 \le h_1 \le \Delta u$ and $0 \le h_2 \le \Delta v$, and the image of the rectangle $R$ (which is, by definition, $\{(u_0 + t_1 \Delta u, v_0 + t_2 \Delta v) \mid 0 \le t_1 \le 1 \text{ and } 0 \le t_2 \le 1\}$) under $r$ would be the parallelogram

\[
P = \left\{ \left( x(u_0,v_0) + \frac{\partial x}{\partial u} \Delta u\, t_1 + \frac{\partial x}{\partial v} \Delta v\, t_2,\ y(u_0,v_0) + \frac{\partial y}{\partial u} \Delta u\, t_1 + \frac{\partial y}{\partial v} \Delta v\, t_2,\ z(u_0,v_0) + \frac{\partial z}{\partial u} \Delta u\, t_1 + \frac{\partial z}{\partial v} \Delta v\, t_2 \right) \ \middle|\ 0 \le t_1 \le 1 \text{ and } 0 \le t_2 \le 1 \right\}.
\]

In other words, the image of the rectangle would be a parallelogram with sides $r_u \Delta u$ and $r_v \Delta v$ with area $|r_u \times r_v| \Delta u \Delta v$. This means that it is reasonable to think of $\sum_{i=1}^{n} \sum_{j=1}^{m} |r_u \times r_v| \Delta u \Delta v$ as approximating the surface area if the function $r(u,v)$ is continuously differentiable, because then for small rectangles the derivative is approximately constant over each rectangle. Much as we defined a line integral with respect to arc length as the limit of a sum of function values times lengths on segments of a curve, we define a scalar surface integral of a function $f$ on a parametrized (or compact parametrized) surface $S_1 = r(D)$ as

\[
\lim_{n \to \infty} \lim_{m \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} f(r(u_i, v_j)) |r_u \times r_v| \Delta u \Delta v.
\]

This motivates us to define the surface area and surface integral of a standard piecewise-smooth surface $r(D)$ as follows.


Definition 114 


Let $r(D) = S_1$ be a standard surface. The surface integral (or scalar surface integral) of $f$ over $S_1$ is

\[
\iint_{S_1} f \, dS = \iint_D f(r(u,v)) |r_u \times r_v| \, dA.
\]

The surface area of $S_1$ is

\[
\iint_{S_1} dS = \iint_D |r_u \times r_v| \, dA.
\]

The surface area or surface integral of a standard piecewise-smooth surface is the sum of the surface areas or integrals of the standard surface components of the surface.


In the case where $S_1$ is the graph of a $C^1$ function $z = g(x,y)$ over a domain $D$ we can use the parametrization $r(u,v) = (u, v, g(u,v))$ on $D$, on which we have (using the formula listed above) a simplification of the surface area formula:

\[
A = \iint_D \sqrt{1 + g_x^2 + g_y^2} \, dA
\]

The corresponding surface integral formula is:

\[
\iint_{S_1} f \, dS = \iint_D f(x, y, g(x,y)) \sqrt{1 + g_x^2 + g_y^2} \, dA
\]


Scalar surface integrals give a total quantity over a surface for which an amount per 
unit area is known along the surface. For instance, integrating mass per unit area over a 
surface would give the total mass of the surface as the surface integral. 


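Example 13.19. Find the surface area of the surface parametrized by $r(u,v) = \langle u \cos v, u \sin v, v \rangle$, $0 \le u \le 1$, $0 \le v \le \pi$.

Solution. Recall that the surface area of $S_1$ is $\iint_D |r_u \times r_v| \, dA$. In this case, $r_u = \langle \cos v, \sin v, 0 \rangle$ and $r_v = \langle -u \sin v, u \cos v, 1 \rangle$, so

\[
r_u \times r_v = \det \begin{pmatrix} i & j & k \\ \cos v & \sin v & 0 \\ -u \sin v & u \cos v & 1 \end{pmatrix} = \langle \sin v, -\cos v, u \rangle
\]

and $|r_u \times r_v| = \sqrt{u^2 + 1}$. Thus,

\[
\text{Area}(S_1) = \iint_D |r_u \times r_v| \, dA = \int_0^{\pi} \int_0^1 \sqrt{u^2 + 1} \, du \, dv.
\]

Setting $u = \tan\theta$ we have $du = \sec^2\theta \, d\theta$, and thus the integral simplifies to

\[
\pi \int_0^{\pi/4} \sec^3\theta \, d\theta = \frac{\pi}{2} \Big( \sec\theta \tan\theta + \ln|\sec\theta + \tan\theta| \Big) \Big|_0^{\pi/4} = \frac{\pi}{2} \left( \sqrt{2} + \ln(\sqrt{2} + 1) \right).
\]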
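The Riemann-sum motivation for surface area can be checked numerically against this example. The following is a minimal sketch, not part of the original text: it approximates the partial derivatives of r by central differences and sums |r_u x r_v| times the cell area over a grid of midpoints. The helper names (`partials`, `surface_area`), the step size `h` and the grid sizes are illustrative assumptions.

```python
import math

def r(u, v):
    """Parametrization from Example 13.19: r(u, v) = <u cos v, u sin v, v>."""
    return (u * math.cos(v), u * math.sin(v), v)

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(c * c for c in a))

def partials(param, u, v, h=1e-6):
    """Central-difference approximations of r_u and r_v (an assumed numerical scheme)."""
    ru = tuple((p - m) / (2 * h) for p, m in zip(param(u + h, v), param(u - h, v)))
    rv = tuple((p - m) / (2 * h) for p, m in zip(param(u, v + h), param(u, v - h)))
    return ru, rv

def surface_area(param, u0, u1, v0, v1, n=200, m=50):
    """Midpoint Riemann sum of |r_u x r_v| du dv over [u0, u1] x [v0, v1]."""
    du = (u1 - u0) / n
    dv = (v1 - v0) / m
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(m):
            v = v0 + (j + 0.5) * dv
            ru, rv = partials(param, u, v)
            total += norm(cross(ru, rv)) * du * dv
    return total

approx = surface_area(r, 0.0, 1.0, 0.0, math.pi)
exact = (math.pi / 2) * (math.sqrt(2) + math.log(math.sqrt(2) + 1))
print(approx, exact)  # the two values should agree to several decimal places
```

The sum uses midpoints rather than corner points only because that converges faster; any choice of sample points in each rectangle has the same limit for a continuously differentiable r.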


Example 13.20. Find the surface area of the portion of the plane $z + 2x + 3y = 6$ in the first octant (where $x \ge 0$, $y \ge 0$ and $z \ge 0$).

Solution. First, this is a triangular disk so we could just find the length of its base and its height, but we will use the methods described above instead. We solve for $z = g(x,y) = 6 - 2x - 3y$ to get the graph which gives the desired surface, and use the formula $A = \iint_D \sqrt{1 + g_x^2 + g_y^2} \, dA$. When $x = z = 0$ we have $y = 2$ and when $y = z = 0$ we have $x = 3$, so the surface described is the graph of $z = g(x,y)$ over the triangular disk $T$ given by $0 \le x \le 3$ and $0 \le y \le 2 - \frac{2}{3}x$. This means that the surface area is

\[
\iint_T \sqrt{1 + 4 + 9} \, dA = 3\sqrt{14}.
\]

We did not need to use the bounds because $T$ is a three by two triangle, so the base $T$ has area three, and we are integrating a constant function in this case so we just multiply the area of $T$ by the constant to get the integral.
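The solution notes that the area could also be found directly from the triangle itself. As a hedged cross-check (a sketch added for illustration, with hypothetical helper names), the triangle has vertices at the intercepts (3,0,0), (0,2,0) and (0,0,6), and half the length of the cross product of two edge vectors should match the value obtained from the graph formula.

```python
import math

def sub(a, b):
    """Difference of two 3-vectors."""
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(c * c for c in a))

# Intercepts of the plane z + 2x + 3y = 6 with the coordinate axes.
A = (3.0, 0.0, 0.0)
B = (0.0, 2.0, 0.0)
C = (0.0, 0.0, 6.0)

# Area of a triangle is half the area of the parallelogram spanned by two edges.
area_from_edges = 0.5 * norm(cross(sub(B, A), sub(C, A)))

# Area from the graph formula: sqrt(1 + g_x^2 + g_y^2) = sqrt(14) times area(T) = 3.
area_from_formula = 3.0 * math.sqrt(14.0)

print(area_from_edges, area_from_formula)
```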


Just as we did with line integrals along vector fields, we also define integrals of vector fields along (or perhaps "through" would be a better word than "along") surfaces, which we refer to as flux integrals. To evaluate such an integral we need to establish an orientation for the surface just as we would establish an orientation for a path.




Definition 115 


Let $S_1$ be a standard piecewise-smooth surface. If there is a continuous function $N : S_1 \to S^2$, the sphere $S^2 = \{(x,y,z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 = 1\}$, then we refer to $N(x,y,z)$ as an orientation of the surface $S_1$.


Some surfaces are not orientable (meaning that there is no orientation function as described). The classic example of such a surface is usually the Mobius band, but we are not going to prove that this surface is not orientable. Essentially, we assign a unit normal vector to every point of the surface. If we can do so in such a way that the assigned unit normal vectors vary continuously over the surface $S_1$ then that is an orientation. There are only two possible orientations for a connected surface. The idea intuitively with a flux integral is that if $F(x,y,z) = (P,Q,R)$ represents rate of flow (such as fluid flow) at each point then we would like the flux integral to represent the rate at which the fluid passes through the surface $S_1$ in the direction in which $S_1$ is oriented. Thus, we define the flux integral to be the surface integral of the component of $F$ in the unit normal direction $n$ pointing in the direction of the orientation of the surface. For a given standard parametrized surface $r(D)$, the only possible orientations are

\[
\frac{r_u \times r_v}{|r_u \times r_v|} \quad \text{or} \quad -\frac{r_u \times r_v}{|r_u \times r_v|}
\]

throughout the surface. This is because there are exactly two orthogonal vectors to a surface of length one at each point of a surface which is locally the graph of a differentiable function at a given point, namely those stated, and if $N$ is an orientation for the surface $r(D)$ then the function $N(r(u,v)) \cdot \frac{r_u \times r_v}{|r_u \times r_v|}$ defined on $D$ can only take two values ($1$ or $-1$). The image of $N(r(u,v)) \cdot \frac{r_u \times r_v}{|r_u \times r_v|}$ is connected since $D$ is connected and the function is continuous, meaning that the image is a single point (either $1$ or $-1$), so either $N(r(u,v)) = \frac{r_u \times r_v}{|r_u \times r_v|}$ for all $(u,v) \in D$ or $N(r(u,v)) = -\frac{r_u \times r_v}{|r_u \times r_v|}$ for all $(u,v) \in D$.


Definition 116 


The flux integral (or vector surface integral) of a vector field $F$ over a parametrized surface $r(D) = S_1$ is defined to be:

\[
\iint_{S_1} F \cdot dS = \iint_D F(r(u,v)) \cdot \frac{r_u \times r_v}{|r_u \times r_v|} \, |r_u \times r_v| \, dA
\]

which simplifies to

\[
\iint_{S_1} F \cdot dS = \iint_D F(r(u,v)) \cdot (r_u \times r_v) \, dA
\]

where $r_u \times r_v$ gives the orientation of $S_1$ (or we negate this integral in the case where $r_u \times r_v$ is opposite the direction $n$ in which $S_1$ is oriented).


In the case where $S_1$ is the graph of a function $z = g(x,y)$ oriented upwards over a domain $D$ this formula becomes $\iint_{S_1} F \cdot dS = \iint_D (P,Q,R) \cdot (-g_x, -g_y, 1) \, dA$ since $r_u \times r_v = (-g_u, -g_v, 1)$ and $r(u,v) = (u, v, g(u,v))$ on $D$, and we can exchange the names of the variables $u$ for $x$ and $v$ for $y$ without altering the value of the integral. Hence, this simplifies to:

\[
\iint_{S_1} F \cdot dS = \iint_D -P g_x - Q g_y + R \, dA
\]

assuming that the orientation of $S_1$ is upwards (has a positive $z$ component) on $S_1$. As usual, we negate the integral if the orientation is the opposite (has a negative $z$ component on $S_1$).
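To see the graph formula in action, here is a small numerical sketch (added for illustration, not taken from the text). It assumes a hypothetical field F = <x, y, z> and graph g(x, y) = x^2 + y^2 over the unit square, and approximates the upward flux by a midpoint sum of -P g_x - Q g_y + R. For this choice the integrand reduces to -(x^2 + y^2), whose integral over the unit square is -2/3.

```python
def flux_through_graph(F, g, gx, gy, x0, x1, y0, y1, n=200):
    """Midpoint-rule approximation of the upward flux of F through z = g(x, y):
    the double integral of -P*g_x - Q*g_y + R over [x0, x1] x [y0, y1]."""
    dx = (x1 - x0) / n
    dy = (y1 - y0) / n
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * dx
        for j in range(n):
            y = y0 + (j + 0.5) * dy
            P, Q, R = F(x, y, g(x, y))  # evaluate F on the surface
            total += (-P * gx(x, y) - Q * gy(x, y) + R) * dx * dy
    return total

# Hypothetical example data (assumptions chosen for illustration).
def F(x, y, z):
    return (x, y, z)

def g(x, y):
    return x * x + y * y

def gx(x, y):
    return 2.0 * x

def gy(x, y):
    return 2.0 * y

flux = flux_through_graph(F, g, gx, gy, 0.0, 1.0, 0.0, 1.0)
print(flux)  # close to -2/3; negating gives the downward-oriented flux
```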

We have not yet proven that surface integrals are independent of the parametrization 
chosen to generate the surface, which is important for understanding surface integrals. 


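Theorem 13.20. Let $r : E \to \mathbb{R}^3$ and $r_1 : E_1 \to \mathbb{R}^3$ be regular parametrized surfaces, where $E$ and $E_1$ are open in $\mathbb{R}^2$.

(a) Then for each $x \in E$ we can find some $U_x \subset E$ so that $\phi = r_1^{-1} \circ r : U_x \to \phi(U_x) \subset E_1$ is a $C^1$ homeomorphism with $\Delta_\phi \neq 0$ on $U_x$.

(b) Let $r : D \to \mathbb{R}^3$ and $r_1 : D_1 \to \mathbb{R}^3$ be standard parametrized surfaces with $r(D) = r_1(D_1)$, where $U \subset D \subset \overline{U} \subset E \subset \mathbb{R}^2$ and $U_1 \subset D_1 \subset \overline{U_1} \subset E_1 \subset \mathbb{R}^2$ for Jordan regions $U$, $D$, $E$, where $U$, $U_1$, $E$ and $E_1$ are open. Let $f$ be a continuous function on $r(D)$. Then

\[
\iint_{r(D)} f \, dS = \iint_{r_1(D_1)} f \, dS.
\]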


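Proof. (a) Let $x \in E$. We can find a unique $z_x \in D_1$ so that $r_1(z_x) = r(x)$. We choose positive numbers $\epsilon_x$ and $\delta_x$ so that $r^{-1}$ and $r_1^{-1}$ restricted to $B_{\epsilon_x}(r(x)) \cap r(E)$ are homeomorphisms, and so that $B_{\delta_x}(x) \subset r^{-1}(B_{\epsilon_x}(r(x)))$ and $B_{\delta_x}(z_x) \subset r_1^{-1}(B_{\epsilon_x}(r(x)))$.

We observe that $r_1^{-1} \circ r : B_{\delta_x}(x) \to V$ is a homeomorphism, where $V$ is the open set $r_1^{-1} \circ r(B_{\delta_x}(x))$ in $E_1$. We note at this point that restricting $r_1^{-1} \circ r$ to any open subset of $B_{\delta_x}(x)$ would similarly result in a homeomorphism onto its image.

Let $r_1(u,v) = \langle x(u,v), y(u,v), z(u,v) \rangle$. We know that one of

\[
\begin{vmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \end{vmatrix}, \quad
\begin{vmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial z}{\partial u} & \frac{\partial z}{\partial v} \end{vmatrix}, \quad
\begin{vmatrix} \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \\ \frac{\partial z}{\partial u} & \frac{\partial z}{\partial v} \end{vmatrix}
\]

is non-zero at each point of $E_1$. Without loss of generality, we can assume that $\begin{vmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \end{vmatrix} \neq 0$ at $z_x$. We let $F : V \times \mathbb{R} \to \mathbb{R}^3$ be defined by $F(u,v,w) = \langle x(u,v), y(u,v), z(u,v) + w \rangle$. Then

\[
\Delta_F((z_x, 0)) = \begin{vmatrix} \frac{\partial x}{\partial u}(z_x) & \frac{\partial x}{\partial v}(z_x) & 0 \\ \frac{\partial y}{\partial u}(z_x) & \frac{\partial y}{\partial v}(z_x) & 0 \\ \frac{\partial z}{\partial u}(z_x) & \frac{\partial z}{\partial v}(z_x) & 1 \end{vmatrix} = \begin{vmatrix} \frac{\partial x}{\partial u}(z_x) & \frac{\partial x}{\partial v}(z_x) \\ \frac{\partial y}{\partial u}(z_x) & \frac{\partial y}{\partial v}(z_x) \end{vmatrix} \neq 0.
\]

Hence, by the Inverse Function Theorem we can find a $0 < \eta < \delta_x$ so that $B_\eta(z_x) \subset V$ and $\Delta_F$ is non-zero on $B_\eta((z_x, 0))$, $F(B_\eta((z_x, 0))) = W_x$ is open, and $F^{-1}(x,y,z) = \langle \alpha(x,y,z), \beta(x,y,z), z - z(\alpha(x,y,z), \beta(x,y,z)) \rangle$ is $C^1$ on $W_x$. This means that $\alpha$ and $\beta$ have continuous first partial derivatives.

Next, note that $F(p, 0) = r_1(p)$ for all $p \in B_\eta(z_x)$, so $r_1^{-1}(x,y,z) = \langle \alpha(x,y,z), \beta(x,y,z) \rangle$ on $r_1(B_\eta(z_x)) \subset W_x$, which means that $r_1^{-1}$ is $C^1$. Hence, $\phi = r_1^{-1} \circ r$ is a $C^1$ homeomorphism on $U_x = r^{-1} \circ r_1(B_\eta(z_x))$.

Next, we note that $\phi$ and $\phi^{-1}$ are $C^1$ and $\phi \circ \phi^{-1}$ is the identity, so $D(\phi \circ \phi^{-1})(z) = I$ for all $z \in B_\eta(z_x)$, which means that $\det(D\phi(\phi^{-1}(z)) D\phi^{-1}(z)) = 1$, which means that both $\det D\phi(\phi^{-1}(z))$ and $\det D\phi^{-1}(z)$ are non-zero. Since $\phi$ is a homeomorphism, each point of $U_x$ is $\phi^{-1}(z)$ for some $z \in B_\eta(z_x)$. From this we conclude that $\Delta_\phi$ and $\Delta_{\phi^{-1}}$ are non-zero on $U_x$ and $B_\eta(z_x)$ respectively.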


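(b) By (a), for each $x \in D$ we can pick an $\epsilon_x > 0$ so that $r_1^{-1} \circ r$ is a $C^1$ homeomorphism with non-zero Jacobian determinant on $B_{\epsilon_x}(x)$. We write $\phi = r^{-1} \circ r_1$ for its inverse, which is also a $C^1$ homeomorphism with $\Delta_\phi \neq 0$. We know that $D$ is compact and that $\mathcal{C} = \{B_{\epsilon_x}(x) \mid x \in D\}$ is an open cover of $D$, so by the Lebesgue number theorem we can find $\delta > 0$ so that if $Q$ is a set of diameter less than $\delta$ which intersects $D$ then $Q$ is a subset of an element of $\mathcal{C}$. Let $R$ be a rectangle containing $D$ and let $G$ be a grid on $R$ with $|G| < \frac{\delta}{2}$. Each rectangle $R_i$ of $G$ which intersects $D$ then has diameter less than $\delta$, so it is contained in an element of $\mathcal{C}$. Then by the Change of Variables Theorem,

\[
\iint_{r(D)} f \, dS = \iint_D (f \circ r) |r_u \times r_v| \, dA = \sum_{R_i \in G} \iint_{R_i \cap D} (f \circ r) |r_u \times r_v| \, dA = \sum_{R_i \in G} \iint_{\phi^{-1}(R_i \cap D)} (f \circ r \circ \phi) \, |(r_u \times r_v) \circ \phi| \, |\Delta_\phi| \, dA.
\]

We know that $r \circ \phi = r \circ (r^{-1} \circ r_1) = r_1$ for $(s,t) \in \phi^{-1}(R_i)$. Let $\phi(s,t) = (\alpha(s,t), \beta(s,t))$ and let $r(u,v) = \langle x(u,v), y(u,v), z(u,v) \rangle$ and $r_1(s,t) = \langle x_1(s,t), y_1(s,t), z_1(s,t) \rangle$. Since $r \circ \phi = r_1$, it follows that $x(\alpha(s,t), \beta(s,t)) = x_1(s,t)$, $y(\alpha(s,t), \beta(s,t)) = y_1(s,t)$ and $z(\alpha(s,t), \beta(s,t)) = z_1(s,t)$. From the chain rule we have that

\[
Dr_1(s,t) = D(r \circ \phi)(s,t) = Dr(\phi(s,t)) \, D\phi(s,t) = \begin{pmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \\ \frac{\partial z}{\partial u} & \frac{\partial z}{\partial v} \end{pmatrix}(\phi(s,t)) \begin{pmatrix} \frac{\partial \alpha}{\partial s} & \frac{\partial \alpha}{\partial t} \\ \frac{\partial \beta}{\partial s} & \frac{\partial \beta}{\partial t} \end{pmatrix}(s,t).
\]

This gives us:

\[
\begin{aligned}
\frac{\partial x_1}{\partial s} &= \frac{\partial x}{\partial u} \frac{\partial \alpha}{\partial s} + \frac{\partial x}{\partial v} \frac{\partial \beta}{\partial s}, &
\frac{\partial x_1}{\partial t} &= \frac{\partial x}{\partial u} \frac{\partial \alpha}{\partial t} + \frac{\partial x}{\partial v} \frac{\partial \beta}{\partial t}, \\
\frac{\partial y_1}{\partial s} &= \frac{\partial y}{\partial u} \frac{\partial \alpha}{\partial s} + \frac{\partial y}{\partial v} \frac{\partial \beta}{\partial s}, &
\frac{\partial y_1}{\partial t} &= \frac{\partial y}{\partial u} \frac{\partial \alpha}{\partial t} + \frac{\partial y}{\partial v} \frac{\partial \beta}{\partial t}, \\
\frac{\partial z_1}{\partial s} &= \frac{\partial z}{\partial u} \frac{\partial \alpha}{\partial s} + \frac{\partial z}{\partial v} \frac{\partial \beta}{\partial s}, &
\frac{\partial z_1}{\partial t} &= \frac{\partial z}{\partial u} \frac{\partial \alpha}{\partial t} + \frac{\partial z}{\partial v} \frac{\partial \beta}{\partial t},
\end{aligned}
\]

where the partial derivatives of $x$, $y$, $z$ are evaluated at $\phi(s,t)$ and those of $\alpha$, $\beta$ at $(s,t)$. In vector form,

\[
\frac{\partial r_1}{\partial s}(s,t) = \frac{\partial r}{\partial u}(\phi(s,t)) \frac{\partial \alpha}{\partial s}(s,t) + \frac{\partial r}{\partial v}(\phi(s,t)) \frac{\partial \beta}{\partial s}(s,t), \qquad
\frac{\partial r_1}{\partial t}(s,t) = \frac{\partial r}{\partial u}(\phi(s,t)) \frac{\partial \alpha}{\partial t}(s,t) + \frac{\partial r}{\partial v}(\phi(s,t)) \frac{\partial \beta}{\partial t}(s,t).
\]

From this, we see that

\[
\begin{aligned}
r_{1s} \times r_{1t} &= \left( r_u(\phi(s,t)) \frac{\partial \alpha}{\partial s} + r_v(\phi(s,t)) \frac{\partial \beta}{\partial s} \right) \times \left( r_u(\phi(s,t)) \frac{\partial \alpha}{\partial t} + r_v(\phi(s,t)) \frac{\partial \beta}{\partial t} \right) \\
&= r_u(\phi(s,t)) \times r_v(\phi(s,t)) \frac{\partial \alpha}{\partial s} \frac{\partial \beta}{\partial t} + r_v(\phi(s,t)) \times r_u(\phi(s,t)) \frac{\partial \beta}{\partial s} \frac{\partial \alpha}{\partial t} \\
&= r_u(\phi(s,t)) \times r_v(\phi(s,t)) \det \begin{pmatrix} \frac{\partial \alpha}{\partial s} & \frac{\partial \alpha}{\partial t} \\ \frac{\partial \beta}{\partial s} & \frac{\partial \beta}{\partial t} \end{pmatrix}(s,t).
\end{aligned}
\]

Thus,

\[
\sum_{R_i \in G} \iint_{\phi^{-1}(R_i \cap D)} (f \circ r \circ \phi) \, |(r_u \times r_v) \circ \phi| \, |\Delta_\phi| \, dA = \sum_{R_i \in G} \iint_{\phi^{-1}(R_i \cap D)} (f \circ r_1) \, |r_{1s} \times r_{1t}| \, dA = \iint_{D_1} (f \circ r_1) \, |r_{1s} \times r_{1t}| \, dA,
\]

so $\iint_{r(D)} f \, dS = \iint_{r_1(D_1)} f \, dS$ as desired.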
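Independence of parametrization can be illustrated numerically. The sketch below is an added illustration under stated assumptions: it takes the surface r(u, v) = <u cos v, u sin v, v> on [0,1] x [0, pi], a hypothetical change of variables phi(s, t) = ((s + s^2)/2, pi t) on the unit square (which has non-zero Jacobian determinant there), and a sample function f(x, y, z) = x^2 + y^2 + z. It integrates f over the surface using r and using r1 = r o phi, and the two results should agree up to discretization error.

```python
import math

def r(u, v):
    """First parametrization, on [0, 1] x [0, pi]."""
    return (u * math.cos(v), u * math.sin(v), v)

def phi(s, t):
    """Assumed C^1 change of variables from [0, 1] x [0, 1] onto [0, 1] x [0, pi]."""
    return ((s + s * s) / 2.0, math.pi * t)

def r1(s, t):
    """Second parametrization of the same surface: r1 = r o phi."""
    u, v = phi(s, t)
    return r(u, v)

def f(x, y, z):
    """Sample continuous function on the surface (an arbitrary choice)."""
    return x * x + y * y + z

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(sum(c * c for c in a))

def surface_integral(f, param, a0, a1, b0, b1, n=120, h=1e-6):
    """Midpoint sum of f(param) * |param_a x param_b| over [a0, a1] x [b0, b1],
    with the partial derivatives approximated by central differences."""
    da = (a1 - a0) / n
    db = (b1 - b0) / n
    total = 0.0
    for i in range(n):
        a = a0 + (i + 0.5) * da
        for j in range(n):
            b = b0 + (j + 0.5) * db
            pa = tuple((p - m) / (2 * h) for p, m in zip(param(a + h, b), param(a - h, b)))
            pb = tuple((p - m) / (2 * h) for p, m in zip(param(a, b + h), param(a, b - h)))
            total += f(*param(a, b)) * norm(cross(pa, pb)) * da * db
    return total

I_r = surface_integral(f, r, 0.0, 1.0, 0.0, math.pi)
I_r1 = surface_integral(f, r1, 0.0, 1.0, 0.0, 1.0)
print(I_r, I_r1)  # the two approximations should be nearly equal
```

The agreement reflects the identity from the proof: the cross product for r1 is the cross product for r (evaluated at phi) scaled by the Jacobian determinant of phi.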


Changing axis orientation: 


In most cases it is easier to do surface integrals (particularly flux integrals) using the formula where we describe a surface $S_1$ as a graph of a function $z = g(x,y)$. For flux integrals oriented up, this is:

\[
\iint_{S_1} F \cdot dS = \iint_D -P g_x - Q g_y + R \, dA
\]


This is because parametrizing a surface in the first place takes time. Then taking the cross product $r_u \times r_v$ takes time. Even after doing those two steps the integrand after taking $F \cdot (r_u \times r_v)$ is often worse than the integrand in the formula above unless your parametrization was carefully chosen (in which case the integrand probably is better for the parametrized form).

However, it is frequently true that a surface is not easily described as the graph of a 
function z = f(x,y) but it can be described as y = f(x, z) or x = f(y, z). In such cases the 
formula above can still be used with minor changes. As discussed in earlier sections, we can 
often just switch the axes and switch the corresponding variables in integrands and set up 
bounds that way so that our picture is more convenient. When working with vector fields, 
however, we have to make sure that the components of those vector fields are consistent 
with the variable switch. If we were to just switch the axes and switch the variables then we would still get an equivalent integrand with a nicer picture. However, for the vector field $F = \langle P(x,y,z), Q(x,y,z), R(x,y,z) \rangle$, if we switch which variable is $x$ and which is $z$, for instance, then as long as we switch the region accordingly we will have the same values for the entries of $F$ (at points where coordinates are exchanged), but the direction $F$ points will not be the same relative to the new position of the axes. This is because $F$ still says that $P$ is pointing in the $x$ direction, which has, in the new axes, taken the place of the $z$-direction. So, in addition to switching $x$ with $z$ we must also switch the coordinates $P$ and $R$ of $F$ so that the new $F = \langle R(z,y,x), Q(z,y,x), P(z,y,x) \rangle$ corresponds to the newly positioned axes.

The formulas are as follows in the case of the flux integral above: 

If $S_1$ is the graph of $y = g(x,z)$ over $D$ in the $xz$-plane then the flux integral of $F = \langle P,Q,R \rangle$ through $S_1$ oriented in the positive $y$ direction is:

\[
\iint_{S_1} F \cdot dS = \iint_D -P g_x - R g_z + Q \, dA
\]

If $S_1$ is the graph of $x = g(y,z)$ over $D$ in the $yz$-plane then the flux integral of $F = \langle P,Q,R \rangle$ through $S_1$ oriented in the positive $x$ direction is:

\[
\iint_{S_1} F \cdot dS = \iint_D -Q g_y - R g_z + P \, dA
\]
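As a check on the first of these formulas, the normal vector can be computed directly from a parametrization. This short derivation is an added sketch: it assumes the parametrization $r(x,z) = (x, g(x,z), z)$ and takes the cross product in the order $r_z \times r_x$, chosen so that the $y$ component is positive.

```latex
\begin{align*}
r(x,z) &= (x,\ g(x,z),\ z), \qquad r_x = (1,\ g_x,\ 0), \qquad r_z = (0,\ g_z,\ 1), \\
r_z \times r_x &= (-g_x,\ 1,\ -g_z), \\
F \cdot (r_z \times r_x) &= -P g_x + Q - R g_z .
\end{align*}
```

The case $x = g(y,z)$ works the same way with $r(y,z) = (g(y,z), y, z)$ and $r_y \times r_z = (1, -g_y, -g_z)$.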




There are similar corresponding formulas for surface area and surface integrals, though these depend less on a vector field so they may be more apparent:

If $S_1$ is the graph of $y = g(x,z)$ over $D$ in the $xz$-plane then the surface area of $S_1$ is:

\[
A = \iint_D \sqrt{1 + g_x^2 + g_z^2} \, dA
\]

The corresponding surface integral formula is:

\[
\iint_{S_1} f \, dS = \iint_D f(x, g(x,z), z) \sqrt{1 + g_x^2 + g_z^2} \, dA
\]

If $S_1$ is the graph of $x = g(y,z)$ over $D$ in the $yz$-plane then the surface area of $S_1$ is:

\[
A = \iint_D \sqrt{1 + g_y^2 + g_z^2} \, dA
\]

The corresponding surface integral formula is:

\[
\iint_{S_1} f \, dS = \iint_D f(g(y,z), y, z) \sqrt{1 + g_y^2 + g_z^2} \, dA
\]


Stokes’s Theorem 


Stokes’s Theorem gives us a way to use a flux integral of the curl of a vector field 
through a surface to determine the line integral of a vector field along a closed path which 
is the boundary for that surface. Since curl vectors are often much simpler than the original 
vector field, this is helpful. This also lets us prove results which have been stated earlier 
relating circulation at a point to the curl vector. Here is the statement of the result. We 
will only prove the result for special cases. Proving the general form of the theorem would require significant build up and would probably be almost as long as the proof of the Change of Variables Theorem. It probably isn’t worth it for purposes of a text that is primarily focused on calculus methods, because the special case we will prove is sufficient for just about every surface to which most readers would want to apply Stokes’s Theorem. In other words, the increase in the generality of the result does not seem to be worth the increase in the complication of the argument and the higher likelihood that a student would become lost in this case. The
generalized form of Stokes’s Theorem is even more complicated than the one listed here 
as the general form of Stokes’s Theorem, but it is exceptionally useful in certain areas of 
mathematics and readers who are interested are encouraged to study it. 


Theorem 13.21. Stokes’s Theorem. Let $F = \langle P,Q,R \rangle$ be a $C^1$ vector field on a connected open set $D$ containing a piecewise smooth closed surface $S_1$ oriented upwards, whose boundary (with respect to the subspace topology) is a smooth closed curve $C$ which is oriented counterclockwise as viewed from above. Then

\[
\iint_{S_1} \operatorname{curl}(F) \cdot dS = \int_C F \cdot dr.
\]


There are some issues with the statement of this theorem. The first is that we have 
not defined piecewise smooth surface. The second is that it isn’t all that clear what we mean by "with respect to the subspace topology." A piecewise smooth surface is a union of
smooth surfaces and closures of smooth surfaces that only intersect along their boundaries, 
but this is a more general construction than we wish to work with in this text. Recall that 
for a smooth surface, if we restrict ourselves to looking at a portion of the surface in a small 
enough ball then the surface is locally the graph of a C! function (either z = f(x,y) or 
x= g(y,z) or y = h(x, z) over some Jordan region bounded by a smooth curve gives the 
same points as the intersection of the surface with the ball). As a result, we would expect 
that most surfaces we are likely to encounter can be obtained by taking surfaces which 
are graphs of functions over connected Jordan regions which are closures of open sets and 
joining them together along their boundaries. This motivates us to define the following: 


Definition 117 


Let $E$ be a connected open Jordan region in $\mathbb{R}^2$. Let $D = \overline{U} \subset E$, where $U$ is open and connected and $D$ is a Jordan region whose boundary is a piecewise smooth closed curve. Let $f$ be a $C^1$ function on $E$ so that $r_z(u,v) = (u, v, f(u,v))$ is a parametrization for the regular surface $r_z(E)$. Then $r_z(D)$ is a standard smooth graph surface in the variable $z$ over the region $D$ in the $xy$-plane. We also say that $r_x(u,v) = (f(u,v), u, v)$ and $r_y(u,v) = (u, f(u,v), v)$ are standard smooth graph surfaces in the variables $x$ and $y$ respectively over the region $D$ in the $yz$ and $xz$ planes respectively. We say that a point $p \in r(E)$ is in the boundary of a set $S \subset r(E)$ in the subspace topology on $r(E)$ if, for every $\epsilon > 0$, the ball $B_\epsilon(p)$ contains a point in $S$ and a point of $r(E)$ which is not contained in $S$. To avoid repetition of "with respect to the subspace topology on $r(E)$" we will refer to the boundary of $S$ with respect to the subspace topology on $r(E)$ as the manifold boundary of $S$. We refer to the maps $r_x$ or $r_y$ or $r_z$ as standard graph functions with respect to variables $x$, $y$ and $z$ respectively.

If $W = \bigcup_{i=1}^{n} r_i(D_i)$, for standard graph functions $r_i$ on open sets $E_i$ containing $D_i$ for each $i$, is a union of standard smooth graph surfaces, then we say that a point $p$ is in the boundary of $W$ with respect to the subspace topology on $\bigcup_{i=1}^{n} r_i(E_i)$ (or just in the manifold boundary) if every open ball containing $p$ contains a point of $W$ and a point of $\bigcup_{i=1}^{n} r_i(E_i)$ which is not in $W$.
— 


It may be asked why we have to talk about manifold boundary. Well, we are looking at the boundary within the surface itself. The topological boundary (the boundary we have been discussing in all the other sections of the text) of a regular surface in $\mathbb{R}^3$ is the entire surface in every case. We want to specifically look at the edge of the surface, which is the boundary with respect to an extension of a surface to a slightly larger surface within the surface itself. In this context we could talk about the manifold boundary of the portion of a paraboloid $z = 1 - x^2 - y^2$ for $z \ge 0$ being the unit circle on the $xy$-plane where the paraboloid was cut off (but every open ball in $\mathbb{R}^3$ containing a point on that section of a paraboloid will contain points on the paraboloid and points of space that are not on the paraboloid, which would mean the topological boundary (in three space) of the surface would be the entire surface).

When we refer to a manifold boundary for a surface that could be represented as a piecewise smooth graph surface where a point would be in the boundary then the boundary of the surface with the unspecified parametrization includes that point. Thus, we would say that the manifold boundary of the surface $S$ consisting of the portion of the paraboloid $z = 1 - x^2 - y^2$ where $z \ge 0$ is the circle $C$ defined by $x^2 + y^2 = 1$ in the $xy$-plane because we could parametrize the surface as $r(u,v) = (u, v, 1 - u^2 - v^2)$ over $u^2 + v^2 < 100$, and with respect to that parametrization the boundary of $S$ would be $C$. Essentially, we extend the domain of the parametrization to an open set when we take the manifold boundary.

If $l$ is a parametrization of the smooth closed curve $C$ which is the boundary of $D$ in the above definition then $r_z(C)$ is the manifold boundary of $r_z(D)$. If $C$ is oriented counterclockwise then we say that $r_z \circ l$ is oriented counterclockwise as viewed from above (or if we replace $z$ by $x$ or $y$ then as viewed from the positive $x$ or positive $y$ direction).


Theorem 13.22. Let $E$ be a connected open Jordan region in $\mathbb{R}^2$. Let $D = \overline{U} \subset E$, where $U$ is open and connected and $D$ is a Jordan region whose boundary is a counterclockwise oriented piecewise smooth closed curve $C$ parametrized by $l : [a,b] \to \mathbb{R}^2$. Let $f$ be a $C^1$ function on $E$ so that $r_z(u,v) = (u, v, f(u,v))$ is a standard graph parametrization for the regular surface $r_z(E)$. Then the manifold boundary of $S = r_z(D)$ is $r_z(C)$ which is parametrized by $r_z \circ l : [a,b] \to \mathbb{R}^3$.

Likewise, if $r_y(u,v) = (u, f(u,v), v)$ or $r_x(u,v) = (f(u,v), u, v)$ are the standard graph parametrizations for the surface then $r_y(C)$ or $r_x(C)$ are the manifold boundaries for $r_y(D)$ or $r_x(D)$ respectively.


Proof. We will only prove this for the first case. The other two cases are the same up to a re-labeling of variables. First, observe that the path $r_z \circ l$ is piecewise smooth since whenever $\sqrt{x'(t)^2 + y'(t)^2} > 0$ it is also true that $\sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2} > 0$ (where $z(t) = f(x(t), y(t))$) and the composition of continuously differentiable functions is continuously differentiable. Let $p = (x, y, f(x,y)) \in r_z(C)$. Then $(x,y) \in C$. Let $\epsilon > 0$ be small enough so that $B_\epsilon((x,y)) \subset E$ in $\mathbb{R}^2$. Since $f$ is continuous we can choose $\delta < \frac{\epsilon}{2}$ so that if $|(x,y) - (s,t)| < \delta$ then $|f(x,y) - f(s,t)| < \frac{\epsilon}{2}$. Then $B_\delta((x,y))$ contains a point $(s_1, t_1) \notin D$ and a point $(s_2, t_2) \in D$, which means that $(s_1, t_1, f(s_1, t_1)) \in B_\epsilon(p) \setminus r_z(D)$ and $(s_2, t_2, f(s_2, t_2)) \in B_\epsilon(p) \cap r_z(D)$. Thus, $r_z(C)$ is contained in the manifold boundary of $r_z(D)$.

If a point $(x,y,z) \in r_z(E) \setminus r_z(C)$ then $(x,y) \notin C$, which means that there is some $\eta > 0$ so that either $B_\eta((x,y)) \cap D = \emptyset$, in which case $B_\eta((x,y,z)) \cap r_z(D) = \emptyset$, or $B_\eta((x,y)) \subset D$, in which case $B_\eta((x,y,z)) \cap r_z(E) \subset r_z(D)$. Thus, the manifold boundary of $r_z(D)$ is $r_z(C)$.


It is immediate from the definition that standard smooth graph surfaces are standard 
surfaces. Standard smooth graph surfaces have a boundary (in the subspace topology) that 
is easy to see. It is just the image of the boundary of D under r, (or rz or ry, depending 
on which variable the surface is a smooth graph in).




Definition 118 


Let $E$ be a connected open Jordan region in $\mathbb{R}^2$. Let $D = \overline{U} \subset E$, where $U$ is open and connected and $D$ is a Jordan region whose boundary is a counterclockwise oriented piecewise smooth closed curve $C$ parametrized by $l : [a,b] \to \mathbb{R}^2$. Let $f$ be a $C^1$ function on $E$ so that $r_z(u,v) = (u, v, f(u,v))$ is a standard graph parametrization for the regular surface $r_z(E)$. Then we say that the piecewise smooth simple closed curve $Q$ defined by $r_z \circ l : [a,b] \to \mathbb{R}^3$ (whose trace is the manifold boundary) is oriented counterclockwise as viewed from above. If we replace $r_z$ with $r_y(u,v) = (u, f(u,v), v)$ or $r_x(u,v) = (f(u,v), u, v)$ in this definition then we say that $Q$ is oriented counterclockwise as viewed from the positive $y$ or $x$ direction respectively.

We define a piecewise smooth graph surface inductively. First, a standard smooth graph surface is a piecewise smooth graph surface. Next, assume that the union of smooth graph surfaces $S = S_1 \cup S_2 \cup \dots \cup S_{k-1}$ is a piecewise smooth graph surface whose manifold boundary is a piecewise smooth closed curve $C_{k-1}$ with parametrization $v$. Let $S_k$ be a smooth graph surface with manifold boundary $C_k$ with parametrization $w$ so that $S_k \cap S \subset C_k \cap C_{k-1} = Q$ with parametrization $q : [c,d] \to \mathbb{R}^3$, a smooth curve whose orientations induced by $w$ and $v$ are opposite to each other, and $S_k \cup S$ has manifold boundary $C = (C_{k-1} \cup C_k) \setminus q((c,d))$. Further assume that $C$ has a parametrization $r$ so that the orientation of $C_{k-1} \setminus q((c,d))$ induced by $r$ is the same as the orientation induced by $v$ and the orientation of $C_k \setminus q((c,d))$ induced by $r$ is the same as the orientation induced by $w$. Then $S \cup S_k$ is a piecewise smooth graph surface.


We will now prove Stokes’s theorem for a standard smooth graph surface. It is convenient to have a notation for manifold boundary, so we will use $\partial_M(S)$ to denote the boundary of $S$ with respect to the subspace topology on the set $M$.


Theorem 13.23. Special Case of Stokes’s Theorem for standard smooth graph surfaces. Let $E$ be a connected open Jordan region in $\mathbb{R}^2$. Let $D = \overline{O} \subset E$, where $O$ is open and connected and $D$ is a Jordan region whose boundary is a counterclockwise oriented piecewise smooth closed curve $L$ parametrized by $l(t) = \langle x(t), y(t) \rangle : [a,b] \to \mathbb{R}^2$. Let $g$ be a $C^2$ function on $E$ so that $r(u,v) = (u, v, g(u,v))$ is a standard graph parametrization for the regular surface $r(E)$ with respect to variable $z$, and let $S_1 = r(D)$. Let $C$ be the counterclockwise (as viewed from above) piecewise smooth curve parametrized by $r \circ l(t) = \langle x(t), y(t), g(x(t), y(t)) \rangle : [a,b] \to \mathbb{R}^3$. Let $F = \langle P,Q,R \rangle$ be a $C^1$ vector field on an open set $U$ containing $S_1$. Then

\[
\iint_{S_1} \operatorname{curl}(F) \cdot dS = \int_C F \cdot dr.
\]

Likewise, if $S_1$ is $r_y(D)$ or $r_x(D)$, where $r_y(u,v) = (u, f(u,v), v)$ or $r_x(u,v) = (f(u,v), u, v)$ are the standard graph parametrizations for the surface, then if the orientation of the manifold boundary curve is counterclockwise as viewed from the positive $y$ or $x$ direction respectively then $\iint_{S_1} \operatorname{curl}(F) \cdot dS = \int_C F \cdot dr$.

Proof. The argument is the same in each direction up to a change in coordinate labeling, so we only prove this result for the first stated case.




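We define $P^{(xy)}(x,y) = P(x, y, g(x,y))$, $Q^{(xy)}(x,y) = Q(x, y, g(x,y))$ and $R^{(xy)}(x,y) = R(x, y, g(x,y))$ on $D$ and observe that $P(r(t)) = P^{(xy)}(l(t))$, $Q(r(t)) = Q^{(xy)}(l(t))$ and $R(r(t)) = R^{(xy)}(l(t))$ for all $t \in [a,b]$, where $r(t)$ denotes $r \circ l(t)$. We have

\[
\int_C F \cdot dr = \int_a^b P(r(t)) x'(t) \, dt + \int_a^b Q(r(t)) y'(t) \, dt + \int_a^b R(r(t)) z'(t) \, dt.
\]

Since $z = g(x,y)$ we can use the Chain Rule to write $z'(t) = \frac{\partial g}{\partial x} \frac{dx}{dt} + \frac{\partial g}{\partial y} \frac{dy}{dt}$. Hence, we can write the path integral as

\[
\int_a^b \big(P(r(t)) + g_x R(r(t))\big) x'(t) + \big(Q(r(t)) + g_y R(r(t))\big) y'(t) \, dt = \int_a^b \big(P^{(xy)}(l(t)) + g_x R^{(xy)}(l(t))\big) x'(t) + \big(Q^{(xy)}(l(t)) + g_y R^{(xy)}(l(t))\big) y'(t) \, dt.
\]

So, by Green’s Theorem, we can write

\[
\begin{aligned}
\int_C F \cdot dr &= \int_L \big(P^{(xy)} + R^{(xy)} g_x\big) \, dx + \big(Q^{(xy)} + R^{(xy)} g_y\big) \, dy \\
&= \iint_D \frac{\partial \big(Q^{(xy)} + R^{(xy)} g_y\big)}{\partial x} - \frac{\partial \big(P^{(xy)} + R^{(xy)} g_x\big)}{\partial y} \, dA \\
&= \iint_D Q^{(xy)}_x + R^{(xy)}_x g_y + R^{(xy)} g_{yx} - P^{(xy)}_y - R^{(xy)}_y g_x - R^{(xy)} g_{xy} \, dA.
\end{aligned}
\]

By Clairaut’s Theorem this simplifies to $\iint_D Q^{(xy)}_x - P^{(xy)}_y + R^{(xy)}_x g_y - R^{(xy)}_y g_x \, dA$. Also, if $h(x,y) = (x, y, g(x,y))$ then $P^{(xy)}(x,y) = (P \circ h)(x,y)$, $Q^{(xy)}(x,y) = (Q \circ h)(x,y)$ and $R^{(xy)}(x,y) = (R \circ h)(x,y)$. Hence, by the Chain Rule we have that

\[
Q^{(xy)}_x(x,y) = \nabla Q(h(x,y)) \cdot h_x(x,y) = Q_x(x, y, g(x,y)) + Q_z(x, y, g(x,y)) g_x(x,y).
\]

Similarly, $P^{(xy)}_y(x,y) = P_y(x, y, g(x,y)) + P_z(x, y, g(x,y)) g_y(x,y)$, $R^{(xy)}_x(x,y) = R_x(x, y, g(x,y)) + R_z(x, y, g(x,y)) g_x(x,y)$ and $R^{(xy)}_y(x,y) = R_y(x, y, g(x,y)) + R_z(x, y, g(x,y)) g_y(x,y)$. Hence, this integral can be written as

\[
\begin{aligned}
\iint_D Q^{(xy)}_x - P^{(xy)}_y + R^{(xy)}_x g_y - R^{(xy)}_y g_x \, dA
&= \iint_D Q_x + Q_z g_x - P_y - P_z g_y + R_x g_y + R_z g_x g_y - R_y g_x - R_z g_y g_x \, dA \\
&= \iint_D -(R_y - Q_z) g_x - (P_z - R_x) g_y + Q_x - P_y \, dA = \iint_{S_1} \operatorname{curl}(F) \cdot dS.
\end{aligned}
\]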


Example 13.21. Evaluate the path integral $\int_C F \cdot dr$ where $C$ is the rectangle with vertices $(0,0,2)$, $(2,0,2)$, $(2,2,2)$ and $(0,2,2)$, starting at $(0,0,2)$ and traversed clockwise as viewed from above, and

\[
F(x,y,z) = \langle z + e^{x^2},\ 5x + y\cos(y^2),\ z^2 + 4x \rangle.
\]

Solution. The path is a closed curve in three dimensions, and the curl of the vector field is simple, so we use Stokes’s Theorem. The simplest surface $S_1$ that this curve bounds appears to be the graph of $g(x,y) = 2$ over $D = [0,2] \times [0,2]$. Since the curve is traversed clockwise as viewed from above, Stokes’s Theorem gives us that $\int_C F \cdot dr = -\iint_{S_1} \operatorname{curl}(F) \cdot dS$, where $S_1$ is oriented upwards, which is equal to $\iint_D P g_x + Q g_y - R \, dA$, where $\langle P,Q,R \rangle$ represent the coordinates of the curl of $F$, which is:

\[
\operatorname{curl}(F) = \det \begin{pmatrix} i & j & k \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ z + e^{x^2} & 5x + y\cos(y^2) & z^2 + 4x \end{pmatrix} = \langle 0, -3, 5 \rangle.
\]

Thus, the flux integral is equal to $\iint_D 0(0) + (-3)(0) - 5 \, dA = -5(4) = -20$.
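The answer can be cross-checked by approximating the line integral directly, without Stokes’s Theorem. The sketch below is an added illustration. The exponent in the first component of F is hard to read in the source, so the code assumes e^(x^2); that term depends only on x, so it does not affect the curl, and its contributions along the two edges parallel to the x-axis cancel. The function name `line_integral_segment` and the number of sample points are hypothetical choices.

```python
import math

def F(x, y, z):
    """Vector field of Example 13.21 (the exponent x^2 is an assumption)."""
    return (z + math.exp(x * x), 5 * x + y * math.cos(y * y), z * z + 4 * x)

def line_integral_segment(F, a, b, n=2000):
    """Midpoint-rule approximation of the integral of F . dr along the
    straight segment from point a to point b."""
    d = tuple(bi - ai for ai, bi in zip(a, b))  # direction vector b - a
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        p = tuple(ai + t * di for ai, di in zip(a, d))
        total += sum(fc * dc for fc, dc in zip(F(*p), d)) / n
    return total

# Rectangle in the plane z = 2, traversed clockwise as viewed from above,
# starting at (0, 0, 2).
path = [(0.0, 0.0, 2.0), (0.0, 2.0, 2.0), (2.0, 2.0, 2.0), (2.0, 0.0, 2.0), (0.0, 0.0, 2.0)]

circulation = sum(line_integral_segment(F, path[i], path[i + 1]) for i in range(4))
print(circulation)  # close to -20, matching the value from Stokes's Theorem
```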


We would like to extend Stokes’s Theorem to a fairly general case which will encompass 
most surfaces we might be interested in, so we include the following theorem. 


Theorem 13.24. Let $S_1$ be a piecewise smooth graph surface with piecewise smooth closed curve manifold boundary $C$ given by $v : [a_1, b_1] \to \mathbb{R}^3$ so that for every $C^1$ vector field $F$ on an open set containing $S_1$ it is true that $\iint_{S_1} \operatorname{curl}(F) \cdot dS = \int_C F \cdot dr$. Let $M$ be a standard smooth graph surface with manifold boundary $B$ parametrized by the piecewise smooth curve $w : [a_2, b_2] \to \mathbb{R}^3$ so that $M \cap S_1$ is a simple piecewise smooth curve $I$ given by $l : [a_3, b_3] \to \mathbb{R}^3$, where the orientation given by $l$ is the same as the orientation induced by $w$ and the opposite of the orientation induced by $v$, and the manifold boundary of $S = S_1 \cup M$ is $K = (C \cup B) \setminus l((a_3, b_3))$ which has a piecewise smooth parametrization $r : [a,b] \to \mathbb{R}^3$ so that the orientation of $C \setminus l((a_3, b_3))$ induced by $r$ is the same as the orientation induced by $v$ and the orientation of $B \setminus I$ induced by $r$ is the same as the orientation induced by $w$. Then for every $C^1$ vector field $F$ on an open set containing $S$ it is true that

\[
\iint_S \operatorname{curl}(F) \cdot dS = \int_K F \cdot dr.
\]

Moreover, if $S$ is any piecewise smooth graph surface then $S$ has a manifold boundary curve $K$ so that $\iint_S \operatorname{curl}(F) \cdot dS = \int_K F \cdot dr$.


Proof. We know that

\[
\begin{aligned}
\int_K F \cdot dr &= \int_{C \setminus l((a_3,b_3))} F \cdot dr + \int_{B \setminus l((a_3,b_3))} F \cdot dr \\
&= \int_{C \setminus l((a_3,b_3))} F \cdot dr + \int_{B \setminus l((a_3,b_3))} F \cdot dr + \int_I F \cdot dr - \int_I F \cdot dr \\
&= \int_C F \cdot dr + \int_B F \cdot dr = \iint_{S_1} \operatorname{curl}(F) \cdot dS + \iint_M \operatorname{curl}(F) \cdot dS = \iint_S \operatorname{curl}(F) \cdot dS.
\end{aligned}
\]

We know Stokes’s Theorem is true for a standard smooth graph surface. Proceeding inductively, by definition we see that any piecewise smooth graph surface is a union as described by definition, which means, inductively, that if we assume $\iint_S \operatorname{curl}(F) \cdot dS = \int_K F \cdot dr$ when $S$ is a piecewise smooth graph surface which is a union of $k - 1$ standard smooth graph surfaces then it follows that if $S$ is the union of $k$ standard graph surfaces then $\iint_S \operatorname{curl}(F) \cdot dS = \int_K F \cdot dr$ as well. We conclude that $\iint_S \operatorname{curl}(F) \cdot dS = \int_K F \cdot dr$ for any piecewise smooth graph surface (for one of the two possible orientations of $K$).


From the preceding theorem, we see that if we can construct a piecewise smooth graph 
surface by adding in a new standard smooth surface repeatedly so that the intersection of the 
previous piecewise smooth graph surface with the new surface is a simple piecewise smooth 
curve with opposite induced orientation along the intersection curve, then the union will be 
a new piecewise smooth surface to which Stokes’s Theorem applies. Since a standard surface 
which is oriented clockwise from above and a standard surface oriented clockwise from the 




x or y direction which intersect along a piecewise smooth simple curve will always have 
opposite orientations along the intersection we can just keep appending standard smooth 
graph surfaces to construct more complex surfaces to which Stokes’s Theorem applies. 


It is helpful to be able to use curl to decide whether a vector field is conservative. Before 
we can prove this, we must check that we can parametrize a triangular disk. 


Theorem 13.25. Let T be the triangle with vertices (0,0,0), $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$.
Then there is a $C^1$ parametrization r on a Jordan region D so that r(D) is the triangular
disk bounded by T, and $\int_C F \cdot dr = 0$ if $\operatorname{curl}(F) = 0$ (where C is the closed path along T).


Solution. Let $r(u,v) = u\langle x_1, y_1, z_1\rangle + v\langle x_2, y_2, z_2\rangle$ with $0 \le u \le 1$ and $0 \le v \le 1-u$.
This parametrization includes all points on the line segment from (0,0,0) to $(x_1, y_1, z_1)$ when
v = 0. When u = 0 we have all points on the line segment from (0,0,0) to $(x_2, y_2, z_2)$. Any
point on the line segment from $(x_1, y_1, z_1)$ to $(x_2, y_2, z_2)$ is of the form
$\langle x_1 + t(x_2 - x_1), y_1 + t(y_2 - y_1), z_1 + t(z_2 - z_1)\rangle = (1-t)\langle x_1, y_1, z_1\rangle + t\langle x_2, y_2, z_2\rangle$ for some $0 \le t \le 1$. If we
set v = t and u = 1 − t then this is $u\langle x_1, y_1, z_1\rangle + v\langle x_2, y_2, z_2\rangle = r(u,v)$ with v = 1 − u, so all three edges of
the triangle are in the parametrization. Since the largest value of v for a given u is 1 − u, we
include no points in the parametrization of the form $u\langle x_1, y_1, z_1\rangle + k\langle x_2, y_2, z_2\rangle$ where
k > 1 − u, which means that we only add vector lengths to each point along the line segment
from (0,0,0) to $(x_1, y_1, z_1)$ which give points in the triangle (no points beyond the triangle,
but all the points within the triangle of that form). Every point in the triangle is in the
parallelogram with $\langle x_1, y_1, z_1\rangle$ and $\langle x_2, y_2, z_2\rangle$ as edges, which is $p(u,v) = u\langle x_1, y_1, z_1\rangle + v\langle x_2, y_2, z_2\rangle$
with $0 \le u \le 1$ and $0 \le v \le 1$. Hence, the parametrization includes exactly the points of
this parallelogram of the form $u\langle x_1, y_1, z_1\rangle + k\langle x_2, y_2, z_2\rangle$ which are in the triangle, so r
parametrizes the entire triangular disk.

If curl(F) = 0 then $\int_C F \cdot dr = 0$ by Stokes’s Theorem, since C is a piecewise-smooth
path.


Theorem 13.26. (a) Let $F = \langle P,Q,R\rangle$ be a $C^1$ vector field on $\mathbb{R}^3$. Then F is
conservative if and only if curl(F) = 0.

(b) Let $G = \langle P,Q\rangle$ be a $C^1$ vector field on $\mathbb{R}^2$. Then G is conservative if and only if
curl(G) = 0.


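Proof. (a) First, assume that F is conservative. Since F is conservative, there is a function
f so that $\nabla f = F = \langle f_x, f_y, f_z\rangle$. Hence, $\operatorname{curl}(F) = \langle f_{zy} - f_{yz}, f_{xz} - f_{zx}, f_{yx} - f_{xy}\rangle = \langle 0,0,0\rangle$
by Clairaut’s Theorem.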

To prove the other direction, we first observe that if T is a closed path which is a triangle
then T bounds a triangular disk D which is always a standard smooth graph surface (by
Theorem 13.25), so by our version of Stokes’s Theorem we know that if C is the closed
path along the triangle T then $\oint_C F \cdot dr = \iint_D \operatorname{curl}(F) \cdot dS = 0$ since curl(F) = 0. For a
given point (x,y,z), for any point (a,b,c) which is not collinear with (x,y,z) and the origin
we know that if C is the path along the triangle whose vertices are the origin, (a,b,c) and
(x,y,z) then $\oint_C F \cdot dr = 0$, which means that the integral along the path $C_1$ from the origin
to (x,y,z) is the same as the integral along the path $C_2$ from the origin to (a,b,c) followed
by the path $C_3$ from the point (a,b,c) to the point (x,y,z). The requirement that the
points not be collinear is only for applying Stokes’s Theorem (if they are collinear then the
line integral along C is still zero since we know that the sum of the integrals along a line in
one direction and then returning to its point of origin still gives an integral of zero). As long
as a path consists of only two line segments, then the integral is independent of the choice
of such paths. We define $f(x,y,z) = \int_{L(x,y,z)} F \cdot dr$, where L(x,y,z) is the line segment
from the origin to (x,y,z). Let L be the line segment from (0,0,0) to (x − 1,y,z). Let $L_u$
be the line segment parametrized by $r(t) = \langle x-1+t, y, z\rangle$ over $0 \le t \le u$, and note that $L_1$ is the
line segment from (x − 1,y,z) to (x,y,z). Since L followed by $L_u$ is a path of two line segments
from the origin to (x − 1 + u,y,z), we have
$f(x-1+u,y,z) = \int_L F \cdot dr + \int_{L_u} F \cdot dr = \int_L F \cdot dr + \int_0^u F(x-1+t,y,z) \cdot \langle 1,0,0\rangle\,dt = \int_L F \cdot dr + \int_0^u P(x-1+t,y,z)\,dt$,
so $\frac{\partial}{\partial u} f(x-1+u,y,z) = \frac{\partial}{\partial u}\int_0^u P(x-1+t,y,z)\,dt = P(x-1+u,y,z)$. Hence, the
derivative of f with respect to x when u = 1 is $f_x(x,y,z) = P(x,y,z)$ by the Fundamental
Theorem of Calculus. By renaming variables and repeating the argument we see that
$f_y(x,y,z) = Q(x,y,z)$ and $f_z(x,y,z) = R(x,y,z)$. Hence, $\nabla f = F$.

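(b) First, assume that G is conservative. Since G is conservative, there is a function f
so that $\nabla f = G = \langle f_x, f_y\rangle$. Hence, $\operatorname{curl}(G) = f_{yx} - f_{xy} = 0$ by Clairaut’s Theorem.

For the other direction, assume that $\operatorname{curl}(G) = Q_x - P_y = 0$. Using the argument above, if we define $F(x,y,z) = \langle P,Q,0\rangle$ (where P and Q are
functions of only x and y just as in the definition of G) then $\operatorname{curl}(F) = \langle 0,0,Q_x - P_y\rangle = 0$.
Hence, F is path independent. Let $C_1$ and $C_2$ be piecewise smooth paths parametrized by
$r_1 : [a_1,b_1] \to \mathbb{R}^2$ and $r_2 : [a_2,b_2] \to \mathbb{R}^2$ respectively, and let $K_1$ and $K_2$ be corresponding
paths parametrized by $R_1 : [a_1,b_1] \to \mathbb{R}^3$ defined by $R_1(t) = \langle r_1(t),0\rangle$ and $R_2 : [a_2,b_2] \to \mathbb{R}^3$
defined by $R_2(t) = \langle r_2(t),0\rangle$ respectively, from $(x_1,y_1,0)$ to $(x_2,y_2,0)$. Let $r_1(t) = \langle x_1(t), y_1(t)\rangle$
and $r_2(t) = \langle x_2(t), y_2(t)\rangle$. Then
$\int_{K_1} F \cdot dr = \int_{a_1}^{b_1} \langle P,Q,0\rangle \cdot \langle x_1'(t), y_1'(t), 0\rangle\,dt = \int_{a_1}^{b_1} \langle P,Q\rangle \cdot \langle x_1'(t), y_1'(t)\rangle\,dt = \int_{C_1} G \cdot dr$,
and
$\int_{K_2} F \cdot dr = \int_{a_2}^{b_2} \langle P,Q,0\rangle \cdot \langle x_2'(t), y_2'(t), 0\rangle\,dt = \int_{a_2}^{b_2} \langle P,Q\rangle \cdot \langle x_2'(t), y_2'(t)\rangle\,dt = \int_{C_2} G \cdot dr$.
Since we know that $\int_{K_1} F \cdot dr = \int_{K_2} F \cdot dr$, it must follow that $\int_{C_1} G \cdot dr = \int_{C_2} G \cdot dr$, so
G is path independent and therefore conservative.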


Gauss’s Theorem 


The Divergence Theorem, also known as Gauss’s Theorem, essentially says that if you
add up the divergence per unit volume within a solid then the sum of the divergence times
volumes will add up to the net flux out through the surface bounding the solid. Thus, if you
think of it in terms of the flow of a gas, the net expansion is the flow of the gas out through
the boundary. The more general statement of the Divergence Theorem is as follows:


Theorem 13.27. Divergence Theorem. Let $E \subset \mathbb{R}^3$, where E is a compact set with
piecewise smooth boundary surface $S_1$. If there is an open set U containing E on which
vector field $F = \langle P,Q,R\rangle$ is $C^1$ then $\iiint_E \operatorname{div}(F)\,dV = \iint_{S_1} F \cdot dS$, where $S_1$ is
oriented outwards.


Since our focus in this course is on simpler regions, we will prove the following special 
case of the Divergence Theorem. 


Theorem 13.28. Special Case of the Divergence Theorem. Let $E \subset \mathbb{R}^3$ be a region of
type 1, 2 and 3 whose boundary functions are $C^1$, where S is the boundary of E. If
there is an open set U containing E on which vector field $F = \langle P,Q,R\rangle$ is $C^1$ then
$\iiint_E \operatorname{div}(F)\,dV = \iint_S F \cdot dS$, where S is oriented outwards.


Proof. Since E is of types one, two and three with smooth boundary functions, it follows
that there are $C^1$ functions $z = h_1(x,y)$, $z = h_2(x,y)$ with E equal to the solid between
these two functions over a domain $D_z$ in the xy plane which is a region of types one and two
(so $E = \{(x,y,z) \in \mathbb{R}^3 \mid (x,y) \in D_z \text{ and } h_1(x,y) \le z \le h_2(x,y)\}$), and likewise there are $C^1$
functions $y = g_1(x,z)$, $y = g_2(x,z)$ with E equal to the solid between these two functions
over a domain $D_y$ in the xz plane which is a region of types one and two, and also there
are $C^1$ functions $x = f_1(y,z)$, $x = f_2(y,z)$ with E equal to the solid between these two
functions over a domain $D_x$ in the yz plane which is a region of types one and two.

We focus on the first description for now (E is equal to the solid between $z = h_1(x,y)$,
$z = h_2(x,y)$ over domain $D_z$). Then the boundary S of E is a union of three surfaces:
$S_1 = \{(x,y,h_1(x,y)) \in \mathbb{R}^3 \mid (x,y) \in D_z\}$ oriented downwards,
$S_2 = \{(x,y,h_2(x,y)) \in \mathbb{R}^3 \mid (x,y) \in D_z\}$ oriented upwards, and
$S_3 = \{(x,y,z) \mid (x,y) \in \partial(D_z) \text{ and } h_1(x,y) \le z \le h_2(x,y)\}$
oriented outwards (away from $D_z$). The argument that this is the boundary is the same as
arguments we have already given in the proof that a type three region is a Jordan region.


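We have $\iiint_E \operatorname{div}(F)\,dV = \iiint_E P_x\,dV + \iiint_E Q_y\,dV + \iiint_E R_z\,dV$.
Looking at the third of these integrals we see that
$\iiint_E R_z\,dV = \iint_{D_z} \int_{h_1(x,y)}^{h_2(x,y)} R_z\,dz\,dA = \iint_{D_z} R(x,y,h_2(x,y)) - R(x,y,h_1(x,y))\,dA$
by the Fundamental Theorem of Calculus. The flux integral is
$\iint_S F \cdot dS = \iint_S (P\mathbf{i} + Q\mathbf{j} + R\mathbf{k}) \cdot n\,dS = \iint_S (P\mathbf{i} \cdot n)\,dS + \iint_S (Q\mathbf{j} \cdot n)\,dS + \iint_S (R\mathbf{k} \cdot n)\,dS$.
Looking at the third of these integrals, we note that on $S_3$ the normal direction is
perpendicular to the z axis at every point $(x,y,z) \in S_3$ since the surface $S_3$ contains the
vertical segment from $(x,y,h_1(x,y))$ to $(x,y,h_2(x,y))$ passing through (x,y,z). Thus, $(R\mathbf{k} \cdot n) = 0$ on $S_3$,
so $\iint_{S_3} (R\mathbf{k} \cdot n)\,dS = 0$. Hence, it follows that
$\iint_S (R\mathbf{k} \cdot n)\,dS = \iint_{S_2} (R\mathbf{k} \cdot n)\,dS + \iint_{S_1} (R\mathbf{k} \cdot n)\,dS$.
Using the usual formula for the flux integral through a graph of a function
with vector field $\langle 0,0,R\rangle$ we have that $\iint_{S_2} (R\mathbf{k} \cdot n)\,dS = \iint_{D_z} R(x,y,h_2(x,y))\,dA$
and $-\iint_{S_1} (R\mathbf{k} \cdot n)\,dS = \iint_{D_z} R(x,y,h_1(x,y))\,dA$ (since the second integral is oriented
downwards). Thus,
$\iint_S (R\mathbf{k} \cdot n)\,dS = \iint_{D_z} R(x,y,h_2(x,y)) - R(x,y,h_1(x,y))\,dA = \iiint_E R_z\,dV$.

Repeating this argument over the regions $D_x$, $D_y$ we get that $\iint_S (P\mathbf{i} \cdot n)\,dS = \iiint_E P_x\,dV$ and
$\iint_S (Q\mathbf{j} \cdot n)\,dS = \iiint_E Q_y\,dV$. Thus we have that $\iiint_E \operatorname{div}(F)\,dV = \iint_S F \cdot dS$.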


As with Green’s Theorem, while the most general proof would take us too far afield, 
we would like to establish the Divergence Theorem for more general regions than simply 
regions of type one, two and three. 


Definition 119 


We define a piecewise type one, two and three region inductively as follows. A
solid of type one, two and three is a piecewise type one, two and three region. Suppose
$E = R_1 \cup R_2 \cup \dots \cup R_{m-1}$ is a piecewise type one, two and three region where each $R_i$
is a type one, two and three region, and $R_m$ is a type one, two and three region so
that $E \cap R_m$ is an orientable piecewise smooth graph surface S, the outward
orientation from E on S is opposite the outward orientation from $R_m$ on S, and
the boundary of $E \cup R_m$ is an orientable piecewise smooth graph surface. Further
assume that $\partial(E \cup R_m) = (\partial(E) \cup \partial(R_m)) \setminus I(S)$, where I(S) is the set of points in S
which are not elements of the manifold boundary of S. Then $E \cup R_m$ is a piecewise
type one, two and three region.


Theorem 13.29. Let E be a piecewise type one, two and three region oriented outwards
with boundary surface S. Let F be a $C^1$ vector field on an open set containing E. Then
$\iiint_E \operatorname{div}(F)\,dV = \iint_S F \cdot dS$, where S is oriented outwards.


Proof. We induct on the number of type one, two and three regions whose union is E. We
have already proven this for a single type one, two and three region. Assume that for any
m − 1 type one, two and three regions $R_1, \dots, R_{m-1}$ for which $W = \bigcup_{i=1}^{m-1} R_i$ is a piecewise
type one, two and three region, it is true that $\iiint_W \operatorname{div}(F)\,dV = \iint_T F \cdot dS$, where T is the
piecewise smooth graph surface which is the boundary of W oriented outwards. Then let
$E = W \cup R_m$ for some such W, where $R_m$ is another type one, two and three region so
that $W \cap R_m = K$, a piecewise smooth graph surface so that the orientation of K from
the outward orientation of W is opposite that of $R_m$, and the boundary of E is a piecewise
smooth orientable surface $S = (\partial(R_m) \cup \partial(W)) \setminus I(K)$, where I(K) is the set of points of
K which are not on the manifold boundary of K. Then with outward orientation we have

$\iiint_E \operatorname{div}(F)\,dV = \iiint_W \operatorname{div}(F)\,dV + \iiint_{R_m} \operatorname{div}(F)\,dV = \iint_T F \cdot dS + \iint_{\partial(R_m)} F \cdot dS$
$= \iint_{T \setminus I(K)} F \cdot dS + \iint_{\partial(R_m) \setminus I(K)} F \cdot dS + \iint_K F \cdot dS - \iint_K F \cdot dS$
$= \iint_{T \setminus I(K)} F \cdot dS + \iint_{\partial(R_m) \setminus I(K)} F \cdot dS = \iint_S F \cdot dS$.

The result follows by induction.


Essentially, if you can make a solid by sticking a bunch of type one and two and 
three regions together and gluing them along their boundary surfaces then you can use 
the Divergence Theorem on that solid. Most solids with connected interiors that you are 
likely to think of can be assembled that way, which means that the Divergence Theorem 
works on most solids you are likely to consider. 

Here is an example using the Divergence Theorem. 


Example 13.22. Let $F(x,y,z) = \langle e^z \sin y - 3x, \cos z + 5y, z + \sin(x^2)\rangle$. Find the flux
integral $\iint_S F \cdot dS$, where S is the sphere $x^2 + y^2 + z^2 = 9$ oriented inwards.

Solution. The sphere bounds a ball E, which is a convex solid, so the Divergence Theorem
applies, which states $\iint_S F \cdot dS = \iiint_E \operatorname{div}(F)\,dV$ if the orientation is outwards. In this
case the orientation is inwards and the divergence is div(F) = −3 + 5 + 1 = 3. Hence,
$\iint_S F \cdot dS = -\iiint_E 3\,dV = -3\left(\tfrac{4}{3}\right)\pi(3)^3 = -108\pi$.
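
As a rough numerical cross-check of this example (a sketch of ours, not part of the worked solution), the flux can be approximated directly from the spherical-coordinate parametrization of the sphere with a midpoint rule. The code assumes the field reads $F = \langle e^z \sin y - 3x, \cos z + 5y, z + \sin(x^2)\rangle$; the function names and grid sizes are illustrative choices.

```python
import math

def F(x, y, z):
    # Assumed reading of the field in Example 13.22.
    return (math.exp(z) * math.sin(y) - 3 * x,
            math.cos(z) + 5 * y,
            z + math.sin(x * x))

def outward_flux_sphere(field, radius, n_theta=200, n_phi=400):
    """Midpoint-rule approximation of the outward flux through a sphere centered at the origin."""
    d_theta = math.pi / n_theta
    d_phi = 2 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # Outward unit normal at the sample point.
            nx = math.sin(theta) * math.cos(phi)
            ny = math.sin(theta) * math.sin(phi)
            nz = math.cos(theta)
            fx, fy, fz = field(radius * nx, radius * ny, radius * nz)
            # The area element on the sphere is radius^2 * sin(theta) d_theta d_phi.
            total += (fx * nx + fy * ny + fz * nz) * radius ** 2 * math.sin(theta)
    return total * d_theta * d_phi

# Inward orientation flips the sign of the outward flux.
inward_flux = -outward_flux_sphere(F, 3.0)
print(inward_flux, -108 * math.pi)
```

The two printed values should agree to a few decimal places, which is what the Divergence Theorem predicts.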




Exercises: 


Exercise 13.1. The cycloid is the curve traced by x(t) = r(t − sin(t)) and y(t) = r(1 −
cos(t)). One arch of the cycloid is traced out over $0 \le t \le 2\pi$. Use Green’s Theorem to find
the area between one arch of the cycloid and the x-axis.


Exercise 13.2. In addition to defining flux integrals of three dimensional vector fields
through a surface we can define a flux integral through a curve oriented in one direction
through the curve as follows. Let C be a smooth curve r(t) = (x(t), y(t)) over $t \in [a,b]$ in
$\mathbb{R}^2$. Prove the following:

(a) If we set $n(t) = \frac{(-y'(t), x'(t))}{|r'(t)|}$ or $n(t) = \frac{(y'(t), -x'(t))}{|r'(t)|}$, show n(t) is a unit vector
which is perpendicular to r'(t). We refer to the particular choice of n(t) as a normal direction
through the curve.

(b) Let F(x,y) = (P,Q) be a $C^1$ vector field. For a particular choice of n(t) as described
in (a), we define the flux of F through C in the normal direction n(t) to C to be
$\int_a^b F(r(t)) \cdot n(t)\,|r'(t)|\,dt$. Setting $n(t) = \frac{(y'(t), -x'(t))}{|r'(t)|}$, this is $\int_C P\,dy - Q\,dx$. Prove that if
C is a smooth simple closed curve which is oriented counterclockwise which is the boundary
of a bounded region R then the flux is $\int_C P\,dy - Q\,dx = \iint_R \operatorname{div}(F)\,dA$, and that n(t) is
pointing outwards from R rather than inwards toward R.


Exercise 13.3. Let S; be the surface of the cylinder x? + y* < 4 and 0 < z < 4 oriented 
outwards, but without the bottom of the cylinder. Let F(x,y,z) =< 3x + sin(y?),4y + 


ze” 22 >. Find the flux integral // F-dS. 
Sy 


Exercise 13.4. Find the area enclosed by the curve $r(t) = \langle 2\cos^3(t), 2\sin^3(t)\rangle$ over
$0 \le t \le 2\pi$.


Exercise 13.5. Let $\alpha(s) = \langle 3\sin(\frac{s}{5}), 3\cos(\frac{s}{5}), \frac{4s}{5}\rangle$ be a regular parametrized curve which
is parametrized with respect to arc length. Find the curvature and torsion of $\alpha(s)$.


Exercise 13.6. Let $\alpha : (-1,1) \to \mathbb{R}^3$ be a parametrized curve which is parametrized with
respect to arc length and let $\alpha(0) = s_0$ and N = N(0) be the unit normal to $\alpha$ at $s_0$. Show
that if $\kappa > 0$ at $s_0$ then there is a $\delta > 0$ so that if $0 < |s| < \delta$ then $\alpha(s) \cdot N > 0$.


Exercise 13.7. Let $F(x,y,z) = \langle y, -x, z - y^2 - x^2\rangle$. Evaluate the flux integral $\iint_S F \cdot dS$,
where S is the surface of the paraboloid $z = x^2 + y^2 + 4$ inside the cylinder $x^2 + y^2 = 1$,
oriented upwards.


Exercise 13.8. Evaluate the path integral $\oint_C F \cdot dr$ where C is the rectangle with vertices
(0,0,2), (2,0,2), (2,2,2) and (0,2,2), starting at (0,0,2) and traversed counterclockwise as
viewed from above, and


F(z, y,z) =< yzta?+ ec” az + 5a + ysin(y?), cy + 4x > 


Exercise 13.9. Let C be the path consisting of the line segment from (0,0,2) to (1,5,9)
followed by the line segment from (1,5,9) to (8,0,6), followed by the line segment from
(8,0,6) to (1,0,3). Let $F(x,y,z) = \langle e^y, xe^y, e^z\rangle$. Find $\int_C F \cdot dr$.


Exercise 13.10. Evaluate $\int_C x^2\,ds$, where C is the path along the line segment from (2,0)
to (0,4).


Exercise 13.11. Find the surface area of the surface determined by the following parametric
equation: $r(u,v) = \langle u\cos v, u\sin v, v\rangle$, $0 \le u \le 1$, $0 \le v \le \pi$.


Exercise 13.12. Prove the following curl form of Green’s Theorem: Let C be a positively
oriented smooth closed curve which is the boundary of a piecewise type one and type two
region E in the xy-plane. Let $F = \langle P,Q,0\rangle$ be a $C^1$ vector field. Then
$\oint_C F \cdot dr = \iint_E (\operatorname{curl}(F)) \cdot \mathbf{k}\,dA$.


Exercise 13.13. Prove the following divergence form of Green’s Theorem for flux integrals:
Let C be the positively oriented smooth closed curve $r : [a,b] \to \mathbb{R}^2$ bounding the piecewise
type one and two region E, and let $F = \langle P,Q\rangle$ be a $C^1$ vector field on $\mathbb{R}^2$. Let
$n = \frac{1}{\sqrt{(x'(t))^2 + (y'(t))^2}}\langle y'(t), -x'(t)\rangle$. Then the flux integral of F through C in direction n is $\iint_E \operatorname{div}(F)\,dA$.


Solutions: 


Solution to Exercise 13.1. The cycloid is the curve traced by x(t) = r(t − sin(t)) and
y(t) = r(1 − cos(t)). One arch of the cycloid is traced out over $0 \le t \le 2\pi$. Use Green’s
Theorem to find the area between one arch of the cycloid and the x-axis.


Solution. The region R under one arch is enclosed by the path C consisting of
the path $C_1$ which is r(t) = (r(t − sin(t)), r(1 − cos(t))), $0 \le t \le 2\pi$, followed by the path $C_2$
consisting of $l(t) = (2\pi r(1-t), 0)$, $0 \le t \le 1$. This path is clockwise oriented. If we use the
vector field $F(x,y) = \langle y, 0\rangle$ then by Green’s Theorem we would have
$\oint_C F \cdot dr = \iint_R P_y - Q_x\,dA = \iint_R 1\,dA$, which is the area of R. Evaluating
$\oint_C F \cdot dr = \int_{C_1} F \cdot dr + \int_{C_2} F \cdot dr$, which is
$\int_0^{2\pi} \langle r(1-\cos(t)), 0\rangle \cdot \langle r(1-\cos(t)), r\sin(t)\rangle\,dt + \int_0^1 \langle 0,0\rangle \cdot \langle -2\pi r, 0\rangle\,dt = \int_0^{2\pi} r^2(1 - 2\cos(t) + \cos^2(t))\,dt = r^2(2\pi + \pi) = 3\pi r^2$
by Wallis’s Formula.
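
The line integral in this solution is easy to cross-check numerically. The sketch below (our own illustration, with a hypothetical function name and an arbitrary radius r = 2) applies the midpoint rule to $\int_0^{2\pi} y(t)\,x'(t)\,dt$ along the arch; the segment along the x-axis contributes nothing because y = 0 there.

```python
import math

def cycloid_arch_area(r, n=1000):
    """Approximate the area under one cycloid arch as the integral of y dx."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        y = r * (1 - math.cos(t))       # y(t)
        dx_dt = r * (1 - math.cos(t))   # x'(t)
        total += y * dx_dt * dt
    return total

area = cycloid_arch_area(2.0)
print(area, 3 * math.pi * 2.0 ** 2)
```

Both printed values should agree closely, matching the formula $3\pi r^2$.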


Solution to Exercise 13.2. In addition to defining flux integrals of three dimensional
vector fields through a surface we can define a flux integral through a curve oriented in one
direction through the curve as follows. Let C be a smooth curve r(t) = (x(t), y(t)) over
$t \in [a,b]$ in $\mathbb{R}^2$. Prove the following:

(a) If we set $n(t) = \frac{(-y'(t), x'(t))}{|r'(t)|}$ or $n(t) = \frac{(y'(t), -x'(t))}{|r'(t)|}$, show n(t) is a unit vector
which is perpendicular to r'(t). We refer to the particular choice of n(t) as a normal
direction through the curve.

(b) Let F(x,y) = (P,Q) be a $C^1$ vector field. For a particular choice of n(t) as described
in (a), we define the flux of F through C in the normal direction n(t) to C to be
$\int_a^b F(r(t)) \cdot n(t)\,|r'(t)|\,dt$. Setting $n(t) = \frac{(y'(t), -x'(t))}{|r'(t)|}$, this is $\int_C P\,dy - Q\,dx$. Prove that if
C is a smooth simple closed curve which is oriented counterclockwise which is the boundary
of a bounded region R then the flux is $\int_C P\,dy - Q\,dx = \iint_R \operatorname{div}(F)\,dA$, and that n(t) is
pointing outwards from R rather than inwards toward R.


Solution. (a) We see n(t) is a unit vector because it is divided by its length. To see n(t) is
orthogonal to r'(t) we just note that the dot product
$\frac{(-y'(t), x'(t))}{|r'(t)|} \cdot (x'(t), y'(t)) = 0$, and likewise for the other choice of n(t).

(b) Applying Green’s Theorem, we see that $\int_C P\,dy - Q\,dx = \iint_R P_x + Q_y\,dA = \iint_R \operatorname{div}(F)\,dA$.


Solution to Exercise 13.3. Let S; be the surface of the cylinder x? +y? <4 and0<2<4 
oriented outwards, but without the bottom of the cylinder. Let F(x,y,z) =< 3r+sin(y’), A4y+ 


ze™ 22 —2>=< P,Q,R>. Find the flux integral ial F.-dS. 
Si 


Solution. We will let $S_2$ be the bottom of the cylinder oriented downwards and $S = S_1 \cup S_2$
and let E be the solid cylinder bounded by S. We note that
$\iint_{S_1} F \cdot dS = \iint_S F \cdot dS - \iint_{S_2} F \cdot dS$.


Using the Divergence Theorem we know that ie. F-dS = IT] 3+4+4+2dV = 
Ss E 


9 MOM) — asa. 


Since $S_2$ is the graph of g(x,y) = 0 over the radius two disk D oriented downwards, we
have $\iint_{S_2} F \cdot dS = \iint_D P g_x + Q g_y - R\,dA = \iint_D 0 + 0 + 2\,dA = 8\pi$. Thus,
$\iint_{S_1} F \cdot dS = 48\pi - 8\pi = 40\pi$.


Solution to Exercise 13.4. Find the area enclosed by the curve $r(t) = \langle 2\cos^3(t), 2\sin^3(t)\rangle$
over $0 \le t \le 2\pi$.


Solution. By symmetry, we note that this is four times the area between the curve and the
x-axis over $0 \le t \le \frac{\pi}{2}$, and that x'(t) < 0 on the interior of this interval. So, the area is
$4\int_0^{\pi/2} y(t)(-x'(t))\,dt = 4\int_0^{\pi/2} 2\sin^3(t)\,(6\cos^2(t)\sin(t))\,dt = 48\int_0^{\pi/2} \sin^4(t)\cos^2(t)\,dt = 48\left(\frac{\pi}{32}\right) = \frac{3\pi}{2}$
by Wallis’s formula.
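
For comparison, the same area can be approximated with the Green’s Theorem area formula $\frac{1}{2}\oint_C x\,dy - y\,dx$ over the whole curve. This is only a sketch under our own naming; the parameter `a` stands for the coefficient 2 in the exercise, and the midpoint rule is used for the integral.

```python
import math

def astroid_area(a=2.0, n=2000):
    """Approximate (1/2) * closed integral of (x dy - y dx) for r(t) = <a cos^3 t, a sin^3 t>."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x = a * math.cos(t) ** 3
        y = a * math.sin(t) ** 3
        dx_dt = -3 * a * math.cos(t) ** 2 * math.sin(t)
        dy_dt = 3 * a * math.sin(t) ** 2 * math.cos(t)
        total += 0.5 * (x * dy_dt - y * dx_dt) * dt
    return total

print(astroid_area(), 3 * math.pi / 2)
```

The curve is traversed counterclockwise as t increases, so the formula returns a positive area.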


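Solution to Exercise 13.5. Let $\alpha(s) = \langle 3\sin(\frac{s}{5}), 3\cos(\frac{s}{5}), \frac{4s}{5}\rangle$ be a regular parametrized
curve which is parametrized with respect to arc length. Find the curvature and torsion of
$\alpha(s)$.

Solution. Since $\alpha$ is parametrized with respect to arc length, the curvature is $\kappa(s) = |\alpha''(s)|$,
where $\alpha'(s) = \langle \frac{3}{5}\cos(\frac{s}{5}), -\frac{3}{5}\sin(\frac{s}{5}), \frac{4}{5}\rangle = T(s)$ and $\alpha''(s) = \langle -\frac{3}{25}\sin(\frac{s}{5}), -\frac{3}{25}\cos(\frac{s}{5}), 0\rangle$,
which means that $\kappa(s) = \frac{3}{25}$.

To find the torsion, we first note that $\tau(s)N(s) = B'(s)$. Since $N(s) = \frac{T'(s)}{\kappa(s)}$ we
have $N(s) = \langle -\sin(\frac{s}{5}), -\cos(\frac{s}{5}), 0\rangle$. We know that $B(s) = T(s) \times N(s)$, which means
that
$B(s) = \det\begin{bmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{3}{5}\cos(\frac{s}{5}) & -\frac{3}{5}\sin(\frac{s}{5}) & \frac{4}{5} \\ -\sin(\frac{s}{5}) & -\cos(\frac{s}{5}) & 0 \end{bmatrix} = \langle \frac{4}{5}\cos(\frac{s}{5}), -\frac{4}{5}\sin(\frac{s}{5}), -\frac{3}{5}\rangle$,
so $B'(s) = \langle -\frac{4}{25}\sin(\frac{s}{5}), -\frac{4}{25}\cos(\frac{s}{5}), 0\rangle$. Since $\tau(s)N(s) = B'(s)$, it follows that $\tau(s) = \frac{4}{25}$.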
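
The values in this solution can be cross-checked with the general cross-product formulas for curvature and torsion. The sketch below is our own illustration: it assumes the curve is $\alpha(s) = \langle 3\sin(s/5), 3\cos(s/5), 4s/5\rangle$, uses derivatives computed by hand, and picks the sign of the torsion to match the convention $\tau N = B'$ used in the solution (many texts use the opposite sign).

```python
import math

def derivatives(s):
    """First three derivatives of alpha(s) = <3 sin(s/5), 3 cos(s/5), 4s/5>, computed by hand."""
    c, sn = math.cos(s / 5), math.sin(s / 5)
    d1 = (3 / 5 * c, -3 / 5 * sn, 4 / 5)
    d2 = (-3 / 25 * sn, -3 / 25 * c, 0.0)
    d3 = (-3 / 125 * c, 3 / 125 * sn, 0.0)
    return d1, d2, d3

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def curvature_and_torsion(s):
    d1, d2, d3 = derivatives(s)
    w = cross(d1, d2)
    kappa = math.sqrt(dot(w, w)) / dot(d1, d1) ** 1.5
    # Minus sign: matches the convention tau * N = B' from the solution above.
    tau = -dot(w, d3) / dot(w, w)
    return kappa, tau

print(curvature_and_torsion(0.0), (3 / 25, 4 / 25))
```

Since the curve is a circular helix, both quantities come out the same at every value of s.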


Solution to Exercise 13.6. Let $\alpha : (-1,1) \to \mathbb{R}^3$ be a parametrized curve which is
parametrized with respect to arc length and let $\alpha(0) = s_0$ and N = N(0) be the unit normal
to $\alpha$ at $s_0$. Show that if $\kappa > 0$ at $s_0$ then there is a $\delta > 0$ so that if $0 < |s| < \delta$ then
$\alpha(s) \cdot N > 0$.


Proof. By Theorem 13.8, if we use the coordinate system where $\alpha(0)$ is the origin and the
direction of the x-axis is T, the direction of the y axis is N and the direction of the z axis
is B, where $\alpha(s) = (x(s), y(s), z(s))$, then there are functions $R_x, R_y, R_z : J \to \mathbb{R}$ so that

(i) $x(s) = s - \frac{\kappa^2}{6}s^3 + R_x$

(ii) $y(s) = \frac{\kappa}{2}s^2 + \frac{\kappa'}{6}s^3 + R_y$

(iii) $z(s) = -\frac{\kappa\tau}{6}s^3 + R_z$

and $\lim_{s \to 0} \frac{R_x}{s^3} = \lim_{s \to 0} \frac{R_y}{s^3} = \lim_{s \to 0} \frac{R_z}{s^3} = 0$.

In particular, for some $\delta > 0$, if $0 < |s| < \delta$ then $y(s) = \frac{\kappa}{2}s^2 + \frac{\kappa'}{6}s^3 + R_y > 0$ since both $\frac{\kappa'}{6}s^3$
and $R_y$ have absolute value less than a constant times $|s|^3$, which is less than any positive
constant times $\frac{\kappa}{2}s^2$ for sufficiently small s. In this coordinate system, $y(s) = \alpha(s) \cdot N$ so we
are finished.


Solution to Exercise 13.7. Let $F(x,y,z) = \langle y, -x, z - y^2 - x^2\rangle$. Evaluate the flux
integral $\iint_S F \cdot dS$, where S is the surface of the paraboloid $z = x^2 + y^2 + 4$ inside the
cylinder $x^2 + y^2 = 1$, oriented upwards.


Solution. This is a flux integral over a surface that is not the boundary of a solid, so we have
no shortcuts. Since the surface is the graph of a Cartesian function $g(x,y) = x^2 + y^2 + 4$, the formula we would
use is $\iint_S F \cdot dS = \iint_D -P g_x - Q g_y + R\,dA$ (where D is the unit disk), which equals
$\iint_D -y(2x) + x(2y) + (x^2 + y^2 + 4 - y^2 - x^2)\,dA = \iint_D 4\,dA = 4\pi$.


Solution to Exercise 13.8. Evaluate the path integral $\oint_C F \cdot dr$ where C is the rectangle
with vertices (0,0,2), (2,0,2), (2,2,2) and (0,2,2), starting at (0,0,2) and traversed counterclockwise
as viewed from above, and


F(x, y,2z) =< yzta2?+ e® az + 52+ ysin(y?), cy + 4a > 




Solution. The path is a closed curve in three dimensions, and the curl of the vector field
is simple, so we use Stokes’s Theorem. The simplest surface $S_1$ that this curve bounds
appears to be the graph of g(x,y) = 2 over D = [0,2] × [0,2]. Since the curve
is traversed counterclockwise as viewed from above, Stokes’s Theorem gives us that
$\oint_C F \cdot dr = \iint_{S_1} \operatorname{curl}(F) \cdot dS$, where $S_1$ is oriented upwards, which is equal to $\iint_D -P g_x - Q g_y + R\,dA$,
where $\langle P,Q,R\rangle$ represent the coordinates of the curl of F. Computing the curl from its
determinant formula gives $\operatorname{curl}(F) = \langle 0,-3,5\rangle$. Thus, the flux integral is
equal to $\iint_D -0(0) - (-3)(0) + 5\,dA = 5(4) = 20$.


Solution to Exercise 13.9. Let C be the path consisting of the line segment from (0,0,2)
to (1,5,9) followed by the line segment from (1,5,9) to (8,0,6), followed by the line segment
from (8,0,6) to (1,0,3). Let $F(x,y,z) = \langle e^y, xe^y, e^z\rangle$. Find $\int_C F \cdot dr$.


Solution. Using the Fundamental Theorem of Line Integrals is faster than integrating
directly. We find a potential function f for F by antidifferentiating the first coordinate with
respect to x to get $f = xe^y + g_1(y,z)$, the second with respect to y to get $f = xe^y + g_2(x,z)$
and the third with respect to z to get $f = e^z + g_3(x,y)$. Note that setting $g_1 = e^z$ and
$g_2 = e^z$ and $g_3 = xe^y$ gives $f = xe^y + e^z$ in each case, so $\nabla f = F$.


Thus, by the Fundamental Theorem of Line Integrals, $\int_C F \cdot dr = f(1,0,3) - f(0,0,2) = 1 + e^3 - e^2$.
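
To see the shortcut at work, one can also integrate along the three segments directly and compare with the potential difference. The sketch below is our own illustration (function names are hypothetical); it assumes $F = \langle e^y, xe^y, e^z\rangle$ with potential $f = xe^y + e^z$ and uses composite Simpson's rule on each segment.

```python
import math

def F(x, y, z):
    return (math.exp(y), x * math.exp(y), math.exp(z))

def potential(x, y, z):
    return x * math.exp(y) + math.exp(z)

def segment_integral(p, q, n=2000):
    """Composite Simpson's rule for the line integral of F along the segment from p to q."""
    d = tuple(qi - pi for pi, qi in zip(p, q))

    def g(t):
        point = tuple(pi + t * di for pi, di in zip(p, d))
        # Integrand F(r(t)) . r'(t), where r'(t) = q - p is constant.
        return sum(fi * di for fi, di in zip(F(*point), d))

    h = 1.0 / n  # n must be even for Simpson's rule
    total = g(0.0) + g(1.0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3

vertices = [(0, 0, 2), (1, 5, 9), (8, 0, 6), (1, 0, 3)]
direct = sum(segment_integral(a, b) for a, b in zip(vertices, vertices[1:]))
shortcut = potential(1, 0, 3) - potential(0, 0, 2)
print(direct, shortcut)
```

The two numbers should agree to many decimal places even though the path passes through points where the integrand is large.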


Solution to Exercise 13.10. Evaluate $\int_C x^2\,ds$, where C is the path along the line segment
from (2,0) to (0,4).


Solution. Note that this is a scalar line integral, so we have no shortcuts and must parametrize
the line segment C as $r(t) = \langle 2 - 2t, 4t\rangle$, $0 \le t \le 1$. Then we use the formula
$\int_C f\,ds = \int_a^b f(r(t))\,|r'(t)|\,dt$ to get:
$\int_C x^2\,ds = \int_0^1 (2-2t)^2\sqrt{4+16}\,dt = 2\sqrt{5}\left(\tfrac{4}{3}t^3 - 4t^2 + 4t\right)\Big|_0^1 = \frac{8\sqrt{5}}{3}$.


Solution to Exercise 13.11. Find the surface area of the surface determined by the following
parametric equation: $r(u,v) = \langle u\cos v, u\sin v, v\rangle$, $0 \le u \le 1$, $0 \le v \le \pi$.


Solution. Recall the surface area of $S_1$ is $\iint_D |r_u \times r_v|\,dA$. In this case, $r_u = \langle \cos v, \sin v, 0\rangle$
and $r_v = \langle -u\sin v, u\cos v, 1\rangle$, so
$r_u \times r_v = \det\begin{bmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \cos v & \sin v & 0 \\ -u\sin v & u\cos v & 1 \end{bmatrix} = \langle \sin v, -\cos v, u\rangle$
and $|r_u \times r_v| = \sqrt{u^2+1}$. Thus, $\operatorname{Area}(S_1) = \iint_D |r_u \times r_v|\,dA = \int_0^\pi \int_0^1 \sqrt{u^2+1}\,du\,dv$.
Setting $u = \tan\theta$ we have $du = \sec^2\theta\,d\theta$, and thus the integral simplifies to
$\pi\int_0^{\pi/4} \sec^3\theta\,d\theta = \frac{\pi}{2}\left(\sec\theta\tan\theta + \ln|\sec\theta + \tan\theta|\right)\Big|_0^{\pi/4} = \frac{\pi}{2}\left(\sqrt{2} + \ln(\sqrt{2}+1)\right)$.


Solution to Exercise 13.12. Prove the following curl form of Green’s Theorem: Let C
be a positively oriented smooth closed curve which is the boundary of a piecewise type one
and type two region E in the xy-plane. Let $F = \langle P,Q,0\rangle$ be a $C^1$ vector field. Then
$\oint_C F \cdot dr = \iint_E (\operatorname{curl}(F)) \cdot \mathbf{k}\,dA$.

Proof. Let $r = \langle x(t), y(t), 0\rangle$ over $a \le t \le b$ be a parametrization for C. We know
$(\operatorname{curl}(F)) \cdot \mathbf{k} = \langle R_y - Q_z, P_z - R_x, Q_x - P_y\rangle \cdot \mathbf{k} = Q_x - P_y$. By Green’s Theorem, we know
that
$\oint_C F \cdot dr = \int_a^b \langle P,Q,0\rangle \cdot \langle x'(t), y'(t), 0\rangle\,dt = \int_a^b P x'(t) + Q y'(t)\,dt = \oint_C P\,dx + Q\,dy = \iint_E Q_x - P_y\,dA = \iint_E (\operatorname{curl}(F)) \cdot \mathbf{k}\,dA$.


Solution to Exercise 13.13. Prove the following divergence form of Green’s Theorem for
flux integrals: Let C be the positively oriented smooth closed curve $r : [a,b] \to \mathbb{R}^2$ bounding
the piecewise type one and two region E, and let $F = \langle P,Q\rangle$ be a $C^1$ vector field on
$\mathbb{R}^2$. Let $n = \frac{1}{\sqrt{(x'(t))^2 + (y'(t))^2}}\langle y'(t), -x'(t)\rangle$. Then the flux integral of F through C in direction n is
$\iint_E \operatorname{div}(F)\,dA$.


Proof. By Theorem 13.16, the flux integral of F through C in direction n is
$\iint_E P_x + Q_y\,dA = \iint_E \operatorname{div}(F)\,dA$.
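
As a numerical cross-check of the surface area found for Exercise 13.11 (a sketch of ours, assuming the surface is $r(u,v) = \langle u\cos v, u\sin v, v\rangle$ on $0 \le u \le 1$, $0 \le v \le \pi$), the double integral of $|r_u \times r_v|$ can be approximated with a midpoint rule. The helper names and grid sizes are illustrative.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def helicoid_area(n_u=400, n_v=50):
    """Midpoint-rule approximation of the double integral of |r_u x r_v| over [0,1] x [0,pi]."""
    du = 1.0 / n_u
    dv = math.pi / n_v
    total = 0.0
    for i in range(n_u):
        u = (i + 0.5) * du
        for j in range(n_v):
            v = (j + 0.5) * dv
            r_u = (math.cos(v), math.sin(v), 0.0)
            r_v = (-u * math.sin(v), u * math.cos(v), 1.0)
            normal = cross(r_u, r_v)
            total += math.sqrt(sum(c * c for c in normal)) * du * dv
    return total

exact = (math.pi / 2) * (math.sqrt(2) + math.log(math.sqrt(2) + 1))
print(helicoid_area(), exact)
```

The approximation should match the closed form $\frac{\pi}{2}(\sqrt{2} + \ln(\sqrt{2}+1))$ to several decimal places.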


Chapter 14 


Multivariable Supplemental 
Materials 


14.1 Matrices 


Definition 120 


The (real valued) matrix $A = [a_{ij}]_{m \times n}$ is an array of real number entries with m
rows and n columns. So, for each $1 \le i \le m$ and $1 \le j \le n$ we assign a real number
$a_{ij}$ to be the entry of A in the ith row and jth column. More formally, we define
A to be the function $A : \{1,\dots,m\} \times \{1,\dots,n\} \to \mathbb{R}$ so that $A(i,j) = a_{ij}$. If A
has m rows and n columns then we say A is an m × n or "m by n" matrix. Matrix
addition is much like vector addition. If $B = [b_{ij}]_{m \times n}$ is another m × n matrix then
we define $\alpha A + \beta B = [\alpha a_{ij} + \beta b_{ij}]_{m \times n}$. If $C = [c_{ij}]_{n \times p}$ is an n by p matrix then the
product $AC = \left[\sum_{k=1}^n a_{ik}c_{kj}\right]_{m \times p}$. When convenient we may denote a row or column
of a matrix as being a vector, meaning that the stated vector has the same entries
as the corresponding row or column in the same order with respect to the column
or row entry order respectively. In particular the jth column vector of a matrix is
the vector whose entries are those in the jth column of the matrix, and the jth row
vector is the vector whose entries are those in the jth row of the matrix. Using this
notation, if $A_i$ is the ith row vector of A and $C_j$ is the jth column vector of C then
we can write $AC = [A_i \cdot C_j]_{m \times p}$.

We define $T : \mathbb{R}^n \to \mathbb{R}^m$ to be a linear transformation if for all $x, y \in \mathbb{R}^n$ and
$\alpha, \beta \in \mathbb{R}$ we have $T(\alpha x + \beta y) = \alpha T(x) + \beta T(y)$. Here, $e_j = (0,0,\dots,0,1,0,\dots,0)$ is
the vector whose jth entry is one and whose other entries are zero, called the jth
standard basis vector.


We also use the notation $\det(x_1, x_2, \dots, x_n)$ to denote the determinant of the matrix
whose ith row entries are those of the vector $x_i$.
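
The description of the product AC as the array of dot products of row vectors of A with column vectors of C can be sketched in a few lines. This is only an illustration with matrices stored as lists of rows; the function name `mat_mul` is our own.

```python
def mat_mul(A, C):
    """Product of an m x n matrix A and an n x p matrix C, both given as lists of rows."""
    n = len(C)
    p = len(C[0])
    if any(len(row) != n for row in A):
        raise ValueError("the number of columns of A must equal the number of rows of C")
    # The jth column vector of C, for each j.
    columns = [[C[k][j] for k in range(n)] for j in range(p)]
    # Entry (i, j) of AC is the dot product of the ith row of A with the jth column of C.
    return [[sum(a * c for a, c in zip(row, col)) for col in columns] for row in A]

A = [[1, 2, 3],
     [4, 5, 6]]
C = [[1, 0],
     [0, 1],
     [2, 2]]
product = mat_mul(A, C)
print(product)
```

Here A is 2 by 3 and C is 3 by 2, so the product is a 2 by 2 matrix.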




Definition 121 


If $P : \{1,2,3,\dots,n\} \to \{1,2,3,\dots,n\}$ is a one to one and onto function then we
refer to it as a permutation on the first n natural numbers. If i < j and P(i) > P(j)
we refer to (i, P(i)), (j, P(j)) as an inversion of the permutation and refer to (i, j) as
an inversion pair for P. We define the permutation $P_{(i,j)}$ that interchanges the ith
and jth natural numbers and maps all other natural numbers to themselves to be a
transposition.

We will refer to a product of entries in an n × n matrix A that includes
exactly one entry factor from each row and column as a permutation product. If
$P : \{1,2,3,\dots,n\} \to \{1,2,3,\dots,n\}$ is a permutation then we refer to $\prod_{i=1}^n a_{iP(i)}$ as
the column permutation product for A corresponding to P and $\prod_{i=1}^n a_{P(i)i}$ as the row
permutation product corresponding to P. We refer to the (indexed) $a_{ij}$ terms of a
permutation product as being entry factors of the product. In a given permutation
or row or column product for a matrix we say that an inversion entry pair is a pair
of (indexed) entry factors $(a_{ki}, a_{rj})$ in the product so that k < r and i > j.


We first make a few observations about the preceding definition. What we mean by
"indexed" in the definition is that the entry factors must be inverted with
respect to their indices in the matrix (the values of the entries themselves do not determine
whether the pair constitutes an inversion). So, the permutation product $a_{12}a_{21}a_{33}$ for a
three by three matrix, for example, refers to the product of the terms itself coupled with 
the associated pairings (1,2), (2,1), (3,3) associated with the permutation. It is the index 
pairings that determine the inversions corresponding to the product. However, when we 
refer to operations on permutation products (like adding two permutation products) the 
output is the non-indexed output corresponding to the operation. So, for instance, the sum 
of two permutation products is a number (not a number coupled with a collection of indexed 
pairs). 

Next, in a given row permutation product $\prod_{i=1}^n a_{P(i)i}$ corresponding to permutation P, a
pair of entry factors $(a_{P(k)k}, a_{P(r)r})$ with k < r is an inversion entry pair if and only if (k, r) is an inversion pair
for P (since being an inversion entry pair corresponds to the ordering of P(k) and P(r)
being reversed relative to the order of k and r in the permutation P). Likewise, in the
column permutation product $\prod_{i=1}^n a_{iP(i)}$ corresponding to permutation P, a pair
$(a_{kP(k)}, a_{rP(r)})$ with k < r is an inversion entry pair if and only if (k, r) is an inversion pair for P.

We next observe that for every row permutation product $\prod_{i=1}^n a_{P(i)i}$ for A corresponding
to permutation P, there is a unique permutation $Q = P^{-1}$ so that the column permutation
product corresponding to Q, $\prod_{i=1}^n a_{iQ(i)}$, is the same product of the same indexed terms. This
is easy enough to see. A row permutation product has exactly one factor entry from each
column, so we simply define Q to be the (unique) corresponding permutation that assigns to
each row index j the column index i so that P(i) = j. In other words, $Q = P^{-1}$.

Associated with the observations above, we notice that the pairs $(a_{ks}, a_{rt})$ which are
inversions for a given row permutation product $\prod_{i=1}^n a_{P(i)i}$ are the same pairs which are
inversions for the corresponding column permutation product $\prod_{i=1}^n a_{iP^{-1}(i)}$. This is because
k = P(s) and r = P(t) and so $s = P^{-1}(k)$ and $t = P^{-1}(r)$. Thus, the order of P(s) and
P(t) is reversed relative to the order of s and t if and only if the order of r and k is reversed
relative to the order of $P^{-1}(r)$ and $P^{-1}(k)$. It follows that the pairs which are inversions
in a given permutation product (and the number of such inversions) are determined by the
permutation product terms themselves with their corresponding indices, and are independent
of whether the product is considered as a row permutation product or a column permutation
product. Thus, in our definition above, what we have called an inversion corresponds to an
inversion whether a given permutation product is considered as a row permutation product or a
column permutation product.

Finally, we observe that the number of inversions in the transposition $P_{(i,j)}$ is $2(j - i) - 1$ if j > i (which is an odd number), since each integer between i and j is out of order with respect to the positions of both i and j (giving j − i − 1 inversions twice, once for being in reverse order with respect to i and once for j) and also i and j are out of order with respect to each other (but this is just one inversion) in the transposition.
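
The count of inversions in a transposition can be checked numerically. The following is a minimal sketch, assuming a permutation of 1, ..., n is stored as a Python list of images; the helper names `count_inversions` and `transposition` are illustrative and do not come from the text.

```python
def count_inversions(perm):
    """Number of position pairs a < b whose images are out of order."""
    n = len(perm)
    return sum(1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b])


def transposition(n, i, j):
    """Images of 1..n under the transposition interchanging i and j."""
    perm = list(range(1, n + 1))
    perm[i - 1], perm[j - 1] = perm[j - 1], perm[i - 1]
    return perm


# Compare the inversion count with 2(j - i) - 1 for every i < j when n = 7.
n = 7
formula_holds = all(
    count_inversions(transposition(n, i, j)) == 2 * (j - i) - 1
    for i in range(1, n + 1)
    for j in range(i + 1, n + 1)
)
print(formula_holds)
```

The check is exhaustive only for the chosen n; it illustrates the counting argument rather than replacing it.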


Definition 122 


We will define the sign $\sigma(P)$ of the permutation P to be 1 if the number of inversions of the permutation is even, and −1 if the number of inversions is odd, or in other words $\sigma(P) = (-1)^m$, where m is the number of inversions of P. We define the determinant det(A) = |A| of an n × n matrix $A = [a_{ij}]_{n \times n}$ to be

$$\det(A) = \sum_{P} \sigma(P) \prod_{i=1}^{n} a_{iP(i)},$$

where P is understood to range over all permutations of the first n integers.


From this definition, we see that if a matrix has one entry it has one permutation (the identity with zero inversions) and so the determinant is $|a_{11}| = a_{11}$. For a two by two matrix there are two corresponding permutations of column entries by row, either keeping the column entries the same as row entries or interchanging them (corresponding to one inversion), which means that if $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$ then $\det(A) = a_{11}a_{22} - a_{21}a_{12}$. While it takes more time to compute, by determining the six possible permutations of three integers and counting their inversions we get the result that if $A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$ then

$$\det(A) = (a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32}) - (a_{11}a_{23}a_{32} + a_{12}a_{21}a_{33} + a_{13}a_{22}a_{31}).$$

We first observe that whether we permute the row or column entries we arrive at an equivalent definition of determinant.
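
The permutation-sum definition of the determinant can be evaluated directly for small matrices. The sketch below assumes a matrix is stored as a list of rows and uses 0-based indices; the names `sign` and `det` are illustrative. It also compares a matrix with its transpose, which corresponds to permuting row entries instead of column entries.

```python
from itertools import permutations


def sign(perm):
    """(-1)^m, where m is the number of inversions of perm."""
    n = len(perm)
    m = sum(1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b])
    return -1 if m % 2 else 1


def det(A):
    """Sum over all permutations P of sign(P) times the product of A[i][P(i)]."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        product = 1
        for i in range(n):
            product *= A[i][perm[i]]
        total += sign(perm) * product
    return total


A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
transpose = [list(col) for col in zip(*A)]
print(det([[1, 2], [3, 4]]), det(A), det(transpose))
```

This approach visits all n! permutations, so it is only practical for very small n.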




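Theorem 14.1. Let $A = [a_{ij}]_{n \times n}$. Then $\det(A) = \sum_{P} \sigma(P) \prod_{i=1}^{n} a_{P(i)i}$.


Proof. As discussed above, the row permutation products are exactly the column permutation products and their signs are the same. So,

$$\det(A) = \sum_{P} \sigma(P) \prod_{i=1}^{n} a_{P(i)i} = \sum_{P} \sigma(P) \prod_{i=1}^{n} a_{iP^{-1}(i)} = \sum_{Q} \sigma(Q) \prod_{i=1}^{n} a_{iQ(i)},$$

where $Q = P^{-1}$ ranges over all permutations as P does, and the last sum is the definition of det(A).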


Theorem 14.2. Let P, Q be permutations of {1, 2, 3, ..., n}.

(a) P is a finite composition of transpositions.

(b) If P is a composition of a finite sequence of k transpositions then $\sigma(P) = (-1)^k$.

(c) $\sigma(P \circ Q) = \sigma(P)\sigma(Q)$.

(d) If P can be written as a composition of an even number of transpositions then P 
cannot be written as a composition of an odd number of transpositions. If P can be written as 
a composition of an odd number of transpositions then P cannot be written as a composition 
of an even number of transpositions. 


Proof. (a) If P is the identity then we apply zero transpositions, which is finite and we are finished. Otherwise, let $i_1$ be the first integer so that $P(i_1) \neq i_1$. Then $P(i_1) > i_1$. The first transposition we apply is $P_{(i_1, P(i_1))}$. This permutation agrees with P on integers $\{1, 2, \ldots, i_1\}$ since the transposition did not change the images of integers preceding $i_1$. Let $i_2$ be the first integer greater than $i_1$ so that $P(i_2) \neq P_{(i_1, P(i_1))}(i_2)$. Then $P(i_2) > i_2$ and $i_2 > i_1$. We assign the next transposition to be performed to be $P_{(i_2, P(i_2))}$. This transposition did not move any of the preceding integers on which P agreed with $P_{(i_1, P(i_1))}$ and, in fact, the resulting composition must agree with P on $\{1, 2, 3, \ldots, i_2\}$. We continue until after a certain number of transpositions (no more than n − 1) we have a finite sequence of transpositions whose composition is equal to P.
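
A decomposition into at most n − 1 transpositions can be produced mechanically. The sketch below is a variant of the construction in part (a), not the exact sequence described there: it fixes one position at a time by a swap and records the swaps. It assumes permutations of {0, ..., n − 1} stored as tuples of images; all function names are illustrative.

```python
from itertools import permutations


def compose(p, q):
    """The composition p o q, sending i to p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))


def transposition(n, a, b):
    """The permutation of 0..n-1 that interchanges a and b."""
    t = list(range(n))
    t[a], t[b] = b, a
    return tuple(t)


def to_transpositions(p):
    """Pairs (a, b) such that applying them in order, first pair first, gives p."""
    cur = list(p)
    swaps = []
    for pos in range(len(cur)):
        if cur[pos] != pos:
            q = cur.index(pos)                    # position currently mapped to pos
            cur[pos], cur[q] = cur[q], cur[pos]   # cur becomes cur o (pos q)
            swaps.append((pos, q))
    return swaps


def rebuild(n, swaps):
    """Compose the recorded transpositions, the first one applied first."""
    result = tuple(range(n))
    for a, b in swaps:
        result = compose(transposition(n, a, b), result)
    return result


all_ok = all(
    rebuild(4, to_transpositions(p)) == p and len(to_transpositions(p)) <= 3
    for p in permutations(range(4))
)
print(all_ok)
```

Each swap puts at least one more position in place, which is why no more than n − 1 swaps are ever recorded.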

(b) If we perform a transposition $P_{(i,j)}$ after a permutation P we interchange P(i) and P(j) (we will assume P(i) < P(j)). By doing so, for the composition $P_{(i,j)} \circ P$, we do not affect inversions for integers paired with images which are not in the interval [P(i), P(j)] because their order relative to the order of their image under $P_{(i,j)} \circ P$ is the same as the relative order under P. However, for any integer w so that P(i) < P(w) < P(j) the relative order of i and w to P(i) and P(w) is reversed. That is, if i < w then P(i) < P(w) would not have created an inversion but now $P_{(i,j)} \circ P(i) = P(j) > P(w)$, so an inversion has been created. Likewise, if w < i then there would have been an inversion previously under P and now there is not under $P_{(i,j)} \circ P$. Either change would change the number of inversions by one. Likewise, the relative order would be reversed with respect to j, meaning that if (w, j) was an inversion under P then it is not under $P_{(i,j)} \circ P$, and if (w, j) was not an inversion under P then it is under $P_{(i,j)} \circ P$. This means that there have been a total of $2(P(j) - P(i) - 1)$ inversion changes from integers w so that $P(w) \in (P(i), P(j))$. There is exactly one additional inversion change between the inversions under P and those under $P_{(i,j)} \circ P$, and that is the inversion (or absence of inversion) (i, j) itself. Since that is the only inversion change which is not matched with a corresponding inversion change, the total number of inversion changes is odd (specifically $2(P(j) - P(i)) - 1$), which means that if $n_1$ is the number of inversions for P and $n_2$ is the number of inversions for $P_{(i,j)} \circ P$, then the difference is odd, meaning that $\sigma(P_{(i,j)} \circ P) = (-1)^m \sigma(P)$ where m is odd, and therefore $\sigma(P) = -\sigma(P_{(i,j)} \circ P)$.

(c) By (a) we can write P and Q as products of transpositions, each of which multiplies the sign of the permutation by −1 by part (b). Hence, if P is the composition of k transpositions and Q is the composition of t transpositions then $\sigma(P \circ Q) = \sigma(Q \circ P) = (-1)^{k+t} = (-1)^k(-1)^t = \sigma(P)\sigma(Q)$.

(d) By (b) we note that if P is a composition of an even number of transpositions then $\sigma(P) = 1$ and if P is a composition of an odd number of transpositions then $\sigma(P) = -1$. Since the number of inversions in P is a fixed number, and determines $\sigma(P)$ uniquely, it is impossible for P to be both a composition of an even number of transpositions and also a composition of an odd number of transpositions.
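
Parts (b) and (c) can be spot-checked by brute force for a small n. This is a minimal sketch, assuming permutations of {0, ..., n − 1} stored as tuples of images; `sign` and `compose` are illustrative helper names.

```python
from itertools import permutations


def sign(p):
    """(-1)^m, where m is the number of inversions of p."""
    n = len(p)
    m = sum(1 for a in range(n) for b in range(a + 1, n) if p[a] > p[b])
    return -1 if m % 2 else 1


def compose(p, q):
    """The composition p o q, sending i to p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))


perms = list(permutations(range(4)))

# (c): the sign of a composition is the product of the signs.
multiplicative = all(
    sign(compose(p, q)) == sign(p) * sign(q) for p in perms for q in perms
)

# (b): composing with a single transposition reverses the sign.
swap_01 = (1, 0, 2, 3)
flips = all(sign(compose(swap_01, p)) == -sign(p) for p in perms)
print(multiplicative, flips)
```

Checking all 24 permutations of four elements is an illustration only; the proof above covers every n.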


Definition 123 


Let $P : \{1, 2, 3, \ldots, n\} \to \{1, 2, 3, \ldots, n\}$ be a permutation. If P can be written as a
composition of an odd number of transpositions we say that P is an odd permutation. 
If P can be written as a composition of an even number of transpositions then we 
say that P is an even permutation. 


Definition 124 


The minor $M_{ij}$ of the indexed $a_{ij}$ term of a matrix A is the determinant of the matrix obtained by deleting the ith row and jth column of the matrix A. The cofactor of $a_{ij}$ is $C_{ij} = (-1)^{i+j} M_{ij}$.


Note that for a two by two matrix, if we move along the first row and multiply each entry by its cofactor and add the resulting products we get the determinant of the matrix. It takes slightly longer to compute it, but this is also true for a three by three matrix. That is,

$$\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}.$$

This is not a coincidence, as we will discuss shortly. We will refer to a sum $\sum_{k=1}^{n} a_{kj}C_{kj}$ for an n × n matrix as an expansion by cofactors along the jth column, and $\sum_{k=1}^{n} a_{ik}C_{ik}$ as an expansion by cofactors along the ith row.


Theorem 14.3. Let $A = [a_{ij}]_{n \times n}$ where n > 1. Then for any $1 \le i \le n$, $|A| = \sum_{k=1}^{n} a_{ik}C_{ik}$, and for any $1 \le j \le n$ it is true that $|A| = \sum_{k=1}^{n} a_{kj}C_{kj}$.
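
The statement of Theorem 14.3 can be illustrated by comparing the permutation-sum determinant with every row and column expansion of one matrix. The sketch below assumes a matrix stored as a list of rows with 0-based indices (the parity of i + j is the same as with 1-based indices); the function names are illustrative.

```python
from itertools import permutations


def det(A):
    """Determinant as the signed sum of permutation products."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(1 for a in range(n) for b in range(a + 1, n) if p[a] > p[b])
        product = 1
        for i in range(n):
            product *= A[i][p[i]]
        total += (-1) ** inversions * product
    return total


def minor(A, i, j):
    """Determinant of A with row i and column j deleted."""
    return det([row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i])


def expand_row(A, i):
    """Expansion by cofactors along row i."""
    return sum(A[i][k] * (-1) ** (i + k) * minor(A, i, k) for k in range(len(A)))


def expand_column(A, j):
    """Expansion by cofactors along column j."""
    return sum(A[k][j] * (-1) ** (k + j) * minor(A, k, j) for k in range(len(A)))


A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
print([expand_row(A, i) for i in range(3)], [expand_column(A, j) for j in range(3)])
```

All six expansions agree with `det(A)` for this matrix, as the theorem asserts in general.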




Proof. We proceed by induction on n. It is certainly true when n = 2 by inspection, since we simply multiply each of the entries by its minor (the diagonally opposite entry) and multiply by one if the product is along the forward diagonal and by −1 if it is along the backward diagonal, and note that each of the four possible expansions does give the determinant. In other words, $a_{11}a_{22} - a_{21}a_{12} = a_{22}a_{11} - a_{12}a_{21} = -a_{21}a_{12} + a_{11}a_{22} = -a_{12}a_{21} + a_{22}a_{11}$.
Next, we assume that the determinant of an m × m matrix can be obtained by expanding by cofactors along the ith row. We then let $A = [a_{ij}]_{(m+1) \times (m+1)}$ be an m + 1 by m + 1 matrix.


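To expand along the tth column gives $\sum_{s=1}^{m+1} a_{st}(-1)^{s+t} M_{st}$, and $|A| = \sum_{P} \sigma(P) \prod_{j=1}^{m+1} a_{jP(j)}$. For each pair (s, t) the minor $M_{st} = \sum_{Q} \sigma(Q) \prod_{j=1}^{m} b^{(s,t)}_{jQ(j)}$, where Q is a permutation of the first m integers, $b^{(s,t)}_{jQ(j)} = a_{jQ(j)}$ if j < s and Q(j) < t, $b^{(s,t)}_{jQ(j)} = a_{(j+1)Q(j)}$ if j ≥ s and Q(j) < t, $b^{(s,t)}_{jQ(j)} = a_{j(Q(j)+1)}$ if j < s and Q(j) ≥ t, and $b^{(s,t)}_{jQ(j)} = a_{(j+1)(Q(j)+1)}$ if j ≥ s and Q(j) ≥ t. We refer to the indexed entry of A equal to the listed $b^{(s,t)}_{jQ(j)}$ as the entry of A corresponding to $b^{(s,t)}_{jQ(j)}$.

The permutation products in the determinant sum $\sum_{P} \sigma(P) \prod_{j=1}^{m+1} a_{jP(j)}$ which contain $a_{st}$ as an entry factor are exactly the products of the form $a_{st}\, b^{(s,t)}_{1Q(1)} b^{(s,t)}_{2Q(2)} \cdots b^{(s,t)}_{mQ(m)}$. Each such product corresponds to a permutation $P : \{1, 2, 3, \ldots, m+1\} \to \{1, 2, 3, \ldots, m+1\}$ where P(i) = Q(i) if Q(i) < t, P(s) = t, and P(i) = Q(i) + 1 if Q(i) ≥ t (here, for i ≥ s, the index i of Q refers to row i + 1 of A, as in the description of $b^{(s,t)}_{jQ(j)}$ above), so that

$$a_{st}\, b^{(s,t)}_{1Q(1)} b^{(s,t)}_{2Q(2)} \cdots b^{(s,t)}_{mQ(m)} = a_{1P(1)} a_{2P(2)} \cdots a_{sP(s)} \cdots a_{(m+1)P(m+1)}.$$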
Next, we wish to compute the difference between the number of inversions in the permutation product $\prod_{j=1}^{m} b^{(s,t)}_{jQ(j)}$ and the permutation product $\prod_{j=1}^{m+1} a_{jP(j)}$. We first notice that if i < j and Q(i) < Q(j) then P(i) < P(j), since P(j) is one of Q(j) or Q(j) + 1, and if P(i) = Q(i) + 1 then P(j) = Q(j) + 1. Similarly, if i > j and Q(i) < Q(j) then P(i) ≤ Q(i) + 1 ≤ Q(j) ≤ P(j), and since P is one to one this implies that P(i) < P(j).

Thus, every entry pair $(b^{(s,t)}_{iQ(i)}, b^{(s,t)}_{jQ(j)})$ is an inversion pair for the permutation product $\prod_{j=1}^{m} b^{(s,t)}_{jQ(j)}$ if and only if the pair of corresponding entries is an inversion pair for the permutation product $\prod_{j=1}^{m+1} a_{jP(j)}$.

Since, comparing corresponding entry pairs in the permutation product $\prod_{j=1}^{m} b^{(s,t)}_{jQ(j)}$, there are no changes in the inversions in pairs that only include entry factors of $\prod_{j=1}^{m+1} a_{jP(j)}$ other than $a_{st}$, the number of inversions of $\prod_{j=1}^{m+1} a_{jP(j)}$ minus the number of inversions of $\prod_{j=1}^{m} b^{(s,t)}_{jQ(j)}$ is equal to the number of inversions of the form $(a_{st}, a_{jP(j)})$ for j ≠ s.




If i < s, then $a_{iP(i)} = a_{iQ(i)}$ if 1 ≤ Q(i) < t and $a_{iP(i)} = a_{i(Q(i)+1)}$ if Q(i) ≥ t. Thus, if we let k be the number of integers i so that 1 ≤ i < s and Q(i) ≥ t then it follows that we get k inversion pairs $(a_{st}, a_{iP(i)})$ where $a_{iP(i)}$ corresponds to $b^{(s,t)}_{iQ(i)}$ for some i < s. Likewise, it follows that there are s − k − 1 = y integers i so that 1 ≤ i < s and Q(i) < t, with $a_{iP(i)}$ corresponding to $b^{(s,t)}_{iQ(i)}$, where the $(a_{st}, a_{iP(i)})$ pair is not an inversion pair.

On the other hand, if i ≥ s then $b^{(s,t)}_{iQ(i)} = a_{(i+1)Q(i)}$ if Q(i) < t and $b^{(s,t)}_{iQ(i)} = a_{(i+1)(Q(i)+1)}$ if Q(i) ≥ t. Hence, if Q(i) < t and $a_{jP(j)}$ corresponds to $b^{(s,t)}_{iQ(i)}$ then $(a_{st}, a_{jP(j)})$ is an inversion. If Q(i) ≥ t and i ≥ s then the corresponding entry lies in row i + 1 > s and column Q(i) + 1 > t, so the pair $(a_{st}, a_{jP(j)})$ (where $a_{jP(j)}$ corresponds to $b^{(s,t)}_{iQ(i)}$) is not an inversion.


Let r be the number of integers i so that s ≤ i ≤ m and Q(i) < t. Notice that r + y = t − 1, since there are a total of t − 1 integers that have images less than t under a permutation, since permutations are one to one. Then the total number of inversion entry pairs for the permutation product $\prod_{j=1}^{m+1} a_{jP(j)}$ containing $a_{st}$ as one element of the pair is r + k, which is the difference between the number of inversions in $\prod_{j=1}^{m+1} a_{jP(j)}$ and the number of inversions in $\prod_{j=1}^{m} b^{(s,t)}_{jQ(j)}$. Let x denote the number of integers i so that i ≥ s and Q(i) ≥ t. Since there are a total of m + 1 − t integers i so that Q(i) ≥ t, it follows that x + k = m + 1 − t. Likewise, r + x = m + 1 − s.
Since y + k = s − 1 and y + r = t − 1, adding these equations gives s + t − 2 = 2y + k + r, so (s + t) − (k + r) = 2(y + 1), which means that the difference between s + t and k + r is even. Hence, $\sigma(P) = (-1)^{k+r}\sigma(Q) = (-1)^{s+t}\sigma(Q)$. This means that we can represent the sum of the signed permutation products containing $a_{st}$ as $a_{st}(-1)^{s+t} \sum_{Q} \sigma(Q) \prod_{j=1}^{m} b^{(s,t)}_{jQ(j)}$. Thus,

$$|A| = \sum_{P} \sigma(P) \prod_{j=1}^{m+1} a_{jP(j)} = \sum_{s=1}^{m+1} a_{st}(-1)^{s+t} \sum_{Q} \sigma(Q) \prod_{j=1}^{m} b^{(s,t)}_{jQ(j)} = \sum_{s=1}^{m+1} a_{st}(-1)^{s+t} M_{st},$$

which is the expansion by cofactors along column t. Similarly,

$$|A| = \sum_{P} \sigma(P) \prod_{j=1}^{m+1} a_{jP(j)} = \sum_{t=1}^{m+1} a_{st} \sum_{Q} (-1)^{s+t} \sigma(Q) \prod_{j=1}^{m} b^{(s,t)}_{jQ(j)} = \sum_{t=1}^{m+1} (-1)^{s+t} a_{st} M_{st},$$

the expansion by cofactors along row s.


Definition 125 


For a matrix A, the operations of interchanging two rows, adding a multiple of one row to another row, and multiplying a row of A by a non-zero number are called standard row operations. The n by n identity matrix I is the matrix $[a_{ij}]_{n \times n}$ so that $a_{ij} = 0$ if i ≠ j and $a_{ij} = 1$ if i = j. We can also refer to performing row operations on a system of equations listed with a first equation, then a second below it, and so on until an nth equation is listed at the bottom, with analogous meanings. Interchanging rows i and j is interchanging the ith and jth equations from the top, adding a multiple of the ith row to the jth row is adding that multiple of the ith equation to the jth equation, and multiplying the ith row by a non-zero constant refers to multiplying the ith equation by that non-zero constant.


Theorem 14.4. If a standard row operation is performed on a system of equations then 
the set of solutions to the system of equations is unchanged. 


Proof. Let the system of equations be as follows:

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= c_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= c_2 \\ &\;\;\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= c_m \end{aligned}$$

A set of entries $x_1, x_2, \ldots, x_n$ is a solution to the system if and only if it makes every equation in the system true. If so, then multiplying both sides of the ith equation by a non-zero number α results in an equation which is still true. Likewise, if $\alpha(a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n) = \alpha c_i$ for a non-zero number α, then multiplying both sides by $1/\alpha$ shows that the original equation is true. Hence, values $x_1, x_2, \ldots, x_n$ are a solution to the original system if and only if they are a solution to the system where one row is multiplied by a non-zero constant.

Interchanging two equations does not change the system at all, merely the order in which the equations are listed, so this does not alter the set of solutions to the system.

Finally, if $x_1, x_2, \ldots, x_n$ is a solution to the original system of equations then for any i ≠ j in {1, 2, 3, ..., m} and $\beta \in \mathbb{R}$, since we know that $a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n = c_i$ and $a_{j1}x_1 + a_{j2}x_2 + \cdots + a_{jn}x_n = c_j$, it is also true that $\beta(a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n) = \beta c_i$. Thus, $(a_{j1} + \beta a_{i1})x_1 + (a_{j2} + \beta a_{i2})x_2 + \cdots + (a_{jn} + \beta a_{in})x_n = c_j + \beta c_i$, so the system of equations where $a_{j1}x_1 + a_{j2}x_2 + \cdots + a_{jn}x_n = c_j$ is replaced by $(a_{j1} + \beta a_{i1})x_1 + (a_{j2} + \beta a_{i2})x_2 + \cdots + (a_{jn} + \beta a_{in})x_n = c_j + \beta c_i$ is true. Likewise, if the system of equations which is the original system except that the equation $a_{j1}x_1 + a_{j2}x_2 + \cdots + a_{jn}x_n = c_j$ is replaced by $(a_{j1} + \beta a_{i1})x_1 + (a_{j2} + \beta a_{i2})x_2 + \cdots + (a_{jn} + \beta a_{in})x_n = c_j + \beta c_i$ is true, then since $a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n = c_i$ is true, we know that $-\beta(a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n) = -\beta c_i$ is true. Adding this equation to $(a_{j1} + \beta a_{i1})x_1 + (a_{j2} + \beta a_{i2})x_2 + \cdots + (a_{jn} + \beta a_{in})x_n = c_j + \beta c_i$, we see as before that $a_{j1}x_1 + a_{j2}x_2 + \cdots + a_{jn}x_n = c_j$ is true.

Hence, the system that results from performing a row operation on a given system of 
equations is always a system with the same solutions as the original system of equations. 
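
The three operations can be tried on a small system to see that a solution stays a solution and a non-solution stays a non-solution. The sketch below assumes each equation is stored as a row [a_1, ..., a_n, c] of an augmented list, with 0-based row indices; the function names and the sample system are illustrative.

```python
from fractions import Fraction


def is_solution(aug, x):
    """True when x satisfies every equation; each row of aug is [a_1, ..., a_n, c]."""
    return all(sum(a * xi for a, xi in zip(row[:-1], x)) == row[-1] for row in aug)


def swap(aug, i, j):
    """Interchange equations i and j."""
    rows = [row[:] for row in aug]
    rows[i], rows[j] = rows[j], rows[i]
    return rows


def scale(aug, i, alpha):
    """Multiply equation i by the non-zero number alpha."""
    rows = [row[:] for row in aug]
    rows[i] = [alpha * v for v in rows[i]]
    return rows


def add_multiple(aug, i, j, beta):
    """Add beta times equation i to equation j."""
    rows = [row[:] for row in aug]
    rows[j] = [vj + beta * vi for vi, vj in zip(rows[i], rows[j])]
    return rows


# x + 2y = 5 and 3x - y = 1, which (1, 2) satisfies and (2, 1) does not.
system = [[1, 2, 5], [3, -1, 1]]
changed = add_multiple(scale(swap(system, 0, 1), 0, Fraction(1, 3)), 0, 1, -1)
print(is_solution(changed, [1, 2]), is_solution(changed, [2, 1]))
```

Exact fractions are used so that the equality tests are not affected by rounding.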


Theorem 14.5. Let $A = [a_{ij}]_{n \times n}$ be a matrix.

(a) If we multiply a row or column of A by a constant k, then the determinant of the resulting matrix is k times the determinant of the original matrix.

(b) If we switch two rows or two columns of A then the determinant of the resulting matrix is the additive inverse of the determinant of the original matrix.

(c) If two rows or columns of A are multiples of each other then the determinant is zero.

(d) If we add a multiple of one row (or column respectively) of A to another row (or column respectively) of A then the determinant of the resulting matrix is the same as the determinant of the original matrix.


Proof. (a) Each summand in the determinant is a product of terms that includes exactly one term from each row and exactly one term from each column. Hence, multiplying a column or row by k would multiply exactly one factor in each summand by k, which would multiply the determinant by k.

(b) We proceed by induction. If n = 1 then the result is trivial because there aren't two rows or columns to switch. For n = 2, the result is immediate since switching two rows or columns results in determinants $a_{21}a_{12} - a_{22}a_{11}$ and $a_{12}a_{21} - a_{11}a_{22}$ respectively, which is −|A| in each case. Assume the statement is true for a k × k matrix with k ≥ 2. Let A be a (k + 1) × (k + 1) matrix. If we switch the rth and jth rows of A to achieve a matrix B then since k + 1 ≥ 3 there is another row, say the ith row, which was not switched. Expanding by cofactors along the ith row gives us $|B| = \sum_{l=1}^{k+1} a_{il}(-1)^{i+l} M_{il}$, where each $M_{il}$ is the determinant of the corresponding minor of B, in which the rth and jth rows have been switched from the corresponding minor in matrix A. Since each minor was negated by the inductive hypothesis, it follows that the determinant is negated as well. The argument for switching two columns is similar.

(c) If two rows or two columns of A are the same, then switching those rows or columns results in the same matrix, which has the same determinant. However, by part (b) the determinant should be negated, meaning |A| = −|A|, so |A| = 0. By part (a) it follows that if one row is k times another then the determinant of the resulting matrix is k(0) = 0 as well.

(d) By replacing a row $r_i$ of matrix A by $r_i + k r_j$ (where $r_j$ is the jth row of A) and taking the determinant of the resulting matrix B, this changes the entries of each summand in the determinant containing a factor $a_{il}$ by replacing that factor by $a_{il} + k a_{jl}$. This means that the resulting determinant is det(A) + det(A′), where A′ is the matrix obtained by replacing the ith row of A by $k r_j$, whose determinant is zero by (c). Thus, the determinant of the resulting matrix is the same as that of A. The argument for adding a multiple of a column to another column is similar.
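
The four parts of Theorem 14.5 can be illustrated on a single 3 by 3 example. The following sketch assumes matrices stored as lists of rows with 0-based indices and computes determinants by the permutation sum; the helper names and the sample matrix are illustrative.

```python
from itertools import permutations


def det(A):
    """Determinant as the signed sum of permutation products."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(1 for a in range(n) for b in range(a + 1, n) if p[a] > p[b])
        product = 1
        for i in range(n):
            product *= A[i][p[i]]
        total += (-1) ** inversions * product
    return total


def scale_row(A, i, k):
    """Multiply row i by k."""
    return [[k * v for v in row] if r == i else row[:] for r, row in enumerate(A)]


def swap_rows(A, i, j):
    """Interchange rows i and j."""
    B = [row[:] for row in A]
    B[i], B[j] = B[j], B[i]
    return B


def add_row_multiple(A, src, dst, k):
    """Add k times row src to row dst."""
    B = [row[:] for row in A]
    B[dst] = [d + k * s for s, d in zip(A[src], A[dst])]
    return B


A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
proportional = [[1, 2, 3], [2, 4, 6], [0, 1, 5]]   # second row is twice the first
print(det(A), det(scale_row(A, 1, 5)), det(swap_rows(A, 0, 2)),
      det(add_row_multiple(A, 0, 2, 7)), det(proportional))
```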


Definition 126 


The coefficient matrix for a system of equations:

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= c_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= c_2 \\ &\;\;\vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= c_n \end{aligned}$$

is the matrix $A = [a_{ij}]_{n \times n}$.


Theorem 14.6. Let A be the coefficient matrix for the system of equations:

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= c_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= c_2 \\ &\;\;\vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= c_n \end{aligned}$$

which is also written Ax = c, where $c = (c_1, c_2, \ldots, c_n)$ and $x = (x_1, x_2, \ldots, x_n)$.

Then det(A) ≠ 0 if and only if there is a unique solution to the system. If det(A) = 0 then there are either no solutions to the system of equations or infinitely many solutions to the system.

Furthermore, the determinant of an n by n matrix A is non-zero if and only if there is a sequence of standard row operations that can be performed on A which results in the identity matrix.


Proof. First assume det(A) ≠ 0. Since no column of A is a zero column (since otherwise the determinant would be zero by Theorem 14.3), there is a row, say the $j_1$th row, which contains a non-zero entry in the first column. Thus, we can add a multiple of row $j_1$ to the other rows so that the entry of the first column is zero in the resulting matrix in all other rows (and the determinant of the resulting matrix is the same as that of the original matrix). Then switch the $j_1$th row with the first row if $j_1 \neq 1$, so that the only entry in the first column which is non-zero is in the first row. This either leaves the determinant the same or negates it. Then divide the first row by its first entry so that the first column consists of a 1 in the first row and a zero in every other row. The resulting matrix $B_1$ has a determinant which is non-zero by Theorem 14.5.

Expanding by cofactors along the first column shows that the determinant of $B_1$ is the determinant of the matrix obtained by deleting the first row and column of $B_1$. Thus, since $B_1$ has non-zero determinant, the second column contains a non-zero entry in some $j_2$th row where $j_2 > 1$. Add constant multiples of the $j_2$th row to the other rows to make all entries of the second column zero except for the entry in the $j_2$th row. Notice that such adding does not affect the first column entries since the first entry in the $j_2$th row is zero. Then switch the $j_2$th row with the second row if $j_2 \neq 2$. We then divide the second row by its first non-zero entry to get matrix $B_2$, where the only non-zero entry of the first column is a one in the first row and the only non-zero entry of the second column is a one in the second row. The matrix $B_2$ also has non-zero determinant by Theorem 14.5.

Continuing in this manner, we eventually obtain matrix $B_n = I$. Performing the corresponding row operations on the system of equations results in equations $1(x_i) = k_i$ in each row, giving a unique solution to the system of equations.
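
The reduction to the identity described above is essentially Gauss-Jordan elimination. The sketch below is one possible arrangement of the same row operations (it swaps and scales the pivot row before clearing the column, a slightly different order from the proof), using exact fractions. The function name `gauss_jordan_solve` and the choice to return `None` when no pivot exists are assumptions made for illustration.

```python
from fractions import Fraction


def gauss_jordan_solve(A, c):
    """Reduce [A | c] toward [I | x] with standard row operations.

    Returns the unique solution as a list of Fractions, or None when some
    column has no usable non-zero entry (the case det(A) = 0).
    """
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(ci)] for row, ci in zip(A, c)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]       # interchange two rows
        p = M[col][col]
        M[col] = [v / p for v in M[col]]          # multiply a row by 1/p
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                # add -f times the pivot row to row r
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[-1] for row in M]


print(gauss_jordan_solve([[2, 1], [1, 3]], [3, 5]))
print(gauss_jordan_solve([[1, 2], [2, 4]], [1, 2]))
```

In the second call the coefficient matrix has determinant zero, so no second pivot is found and the sketch reports `None` without deciding between the no-solution and infinitely-many-solutions cases.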

Next, let det(A) = 0. We proceed as before. If there is a non-zero entry in the first column of A then we can create a matrix $B_1$ as described before, except that $\det(B_1) = 0$ since the row operations performed on A to get $B_1$ altered the determinant in a manner that multiplied the original determinant of zero by non-zero numbers. If every entry of the first column is zero then there are two possibilities. The first is that there is a solution in the remaining variables $(x_2, x_3, \ldots, x_n)$, meaning that the following is true.

$$\begin{aligned} a_{12}x_2 + \cdots + a_{1n}x_n &= c_1 \\ a_{22}x_2 + \cdots + a_{2n}x_n &= c_2 \\ &\;\;\vdots \\ a_{n2}x_2 + \cdots + a_{nn}x_n &= c_n \end{aligned}$$

In that case, if we set $x_1$ to be any value t then $(t, x_2, x_3, \ldots, x_n)$ is a solution to the system, so there are infinitely many solutions. If there is no solution to the system listed above (which is the same as the original system) then there are no solutions to the system of equations.

Continuing, if we were able to obtain $B_1$ then we proceed as before to obtain $B_2$ unless there is no row below the first row with a non-zero entry in the second column, in which case we write $B_1 = [b^{(1)}_{ij}]$, and let the constant column have entries $b^{(1)}_1, b^{(1)}_2, \ldots, b^{(1)}_n$ after the row operations performed on A to change A to $B_1$ have been applied to the constant column. Recall from Theorem 14.4 that the solutions to the system $B_1 x = b^{(1)}$, where $b^{(1)} = (b^{(1)}_1, b^{(1)}_2, \ldots, b^{(1)}_n)$, are the same as the solutions to the original system of equations. There are again two possibilities. Since there are no non-zero entries below the first row in the second column, either the corresponding system of equations indicated below the first row:

$$\begin{aligned} b^{(1)}_{23}x_3 + \cdots + b^{(1)}_{2n}x_n &= b^{(1)}_2 \\ b^{(1)}_{33}x_3 + \cdots + b^{(1)}_{3n}x_n &= b^{(1)}_3 \\ &\;\;\vdots \\ b^{(1)}_{n3}x_3 + \cdots + b^{(1)}_{nn}x_n &= b^{(1)}_n \end{aligned}$$

has a solution or it does not. If it does not then there is no solution to the system. If it has a solution $(x_3, x_4, \ldots, x_n)$ then for every value $t = x_2$ we can back substitute to get $x_1 = b^{(1)}_1 - b^{(1)}_{12}t - b^{(1)}_{13}x_3 - \cdots - b^{(1)}_{1n}x_n$, which gives a solution to the first equation as well, so there are infinitely many solutions.


Now, if we are able to perform a sequence of standard row operations on A and end with $B_n = I$ then by reversing each row operation (switching rows that were switched, adding the negative of the multiple of rows to other rows for which the original multiple of rows was added to other rows, and multiplying rows by the multiplicative inverse of constants we multiplied them by before) in reverse order we see that a finite sequence of row operations applied to I results in A, which means that det(A) ≠ 0 since applying every standard row operation multiplies the determinant by a non-zero number, and the determinant of I is one. This is impossible since det(A) = 0.

We conclude that at some stage in our row reduction process there is some $B_j$ so that we cannot create $B_{j+1}$ as described above because all entries in the (j + 1)st column below the jth row are zero, in which case, as before, we have two cases. Either the corresponding system

$$\begin{aligned} b^{(j)}_{(j+1)(j+2)}x_{j+2} + \cdots + b^{(j)}_{(j+1)n}x_n &= b^{(j)}_{j+1} \\ &\;\;\vdots \\ b^{(j)}_{n(j+2)}x_{j+2} + \cdots + b^{(j)}_{nn}x_n &= b^{(j)}_n \end{aligned}$$

has a solution or it does not. If it does not then there is no solution to the system. If it does have a solution then for any value of $t = x_{j+1}$ it follows that by back-substituting to solve for the remaining variables we can find a corresponding solution to the system. In other words, by setting:

$$\begin{aligned} x_j &= b^{(j)}_j - b^{(j)}_{j(j+1)}x_{j+1} - b^{(j)}_{j(j+2)}x_{j+2} - \cdots - b^{(j)}_{jn}x_n \\ x_{j-1} &= b^{(j)}_{j-1} - b^{(j)}_{(j-1)(j+1)}x_{j+1} - b^{(j)}_{(j-1)(j+2)}x_{j+2} - \cdots - b^{(j)}_{(j-1)n}x_n \\ &\;\;\vdots \\ x_1 &= b^{(j)}_1 - b^{(j)}_{1(j+1)}x_{j+1} - b^{(j)}_{1(j+2)}x_{j+2} - \cdots - b^{(j)}_{1n}x_n \end{aligned}$$

we have a solution to the system. Hence, there are infinitely many solutions to the system of equations.


Theorem 14.7. Cramer's Rule. Let

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= c_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= c_2 \\ &\;\;\vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= c_n \end{aligned}$$

be a system of equations with coefficient matrix A having non-zero determinant. Let $A_i$ be the matrix A with the ith column replaced by the constant column vector $(c_1, c_2, \ldots, c_n)$. Then $x_i = \dfrac{|A_i|}{|A|}$.


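Proof. We know the system of equations has a solution by Theorem 14.6. Then it is true that

$$\begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1(i-1)} & a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n & a_{1(i+1)} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2(i-1)} & a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n & a_{2(i+1)} & \cdots & a_{2n} \\ \vdots & & & & \vdots & & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{n(i-1)} & a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n & a_{n(i+1)} & \cdots & a_{nn} \end{vmatrix} = \begin{vmatrix} a_{11} & a_{12} & \cdots & c_1 & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & c_2 & \cdots & a_{2n} \\ \vdots & & & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & c_n & \cdots & a_{nn} \end{vmatrix} = |A_i|.$$

However, by Theorem 14.5 it also follows that this is the same as

$$\begin{vmatrix} a_{11} & a_{12} & \cdots & x_i a_{1i} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & x_i a_{2i} & \cdots & a_{2n} \\ \vdots & & & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & x_i a_{ni} & \cdots & a_{nn} \end{vmatrix} = x_i |A|.$$

Thus, $x_i = \dfrac{|A_i|}{|A|}$.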
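
Cramer's Rule translates directly into a short computation for small systems. The sketch below assumes integer coefficients, computes determinants by the permutation sum, and replaces the ith column by the constant column; the function names and the sample system are illustrative.

```python
from fractions import Fraction
from itertools import permutations


def det(A):
    """Determinant as the signed sum of permutation products."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(1 for a in range(n) for b in range(a + 1, n) if p[a] > p[b])
        product = 1
        for i in range(n):
            product *= A[i][p[i]]
        total += (-1) ** inversions * product
    return total


def cramer(A, c):
    """Solve Ax = c for integer data with det(A) != 0 using x_i = |A_i| / |A|."""
    d = det(A)
    if d == 0:
        raise ValueError("Cramer's Rule requires a non-zero determinant")
    solution = []
    for i in range(len(A)):
        # A_i: the matrix A with column i replaced by the constant column c
        A_i = [row[:i] + [c[r]] + row[i + 1:] for r, row in enumerate(A)]
        solution.append(Fraction(det(A_i), d))
    return solution


# 2x + y = 3 and x + 3y = 5
print(cramer([[2, 1], [1, 3]], [3, 5]))
```

For larger systems row reduction is far cheaper; the rule is mainly of theoretical interest.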


Theorem 14.8. Let $A = [a_{ij}]_{m \times n}$ be a matrix and $e_j$ be the jth standard basis vector of $\mathbb{R}^n$. Then $Ae_j = A_j$, the column matrix whose entries are those of the jth column vector. If B is a matrix so that $Be_j = A_j$ for all 1 ≤ j ≤ n then A = B.


Proof. By definition of matrix multiplication, $Ae_j = \left[\sum_{k=1}^{n} a_{ik}e_{jk}\right]_{m \times 1}$, where $e_{jk}$ is the kth entry of $e_j$. Since all entries except the jth entry are zero and the jth entry is one, this is equal to $[a_{ij}]_{m \times 1} = A_j$. If we replace A by B then since every column of B is the corresponding column of A we know A = B.


Theorem 14.9. Let $A = [a_{ij}]_{m \times n}$ be a matrix. Then the function T(x) = Ax is a linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$.


Proof. First, the fact that each point in $\mathbb{R}^n$ can be mapped under this definition, and that the range would be in $\mathbb{R}^m$, follows directly from the definition of matrix multiplication. To show linearity, note that

$$A(\alpha x + \beta y) = \left[\sum_{k=1}^{n} a_{ik}(\alpha x_k + \beta y_k)\right]_{m \times 1} = \alpha\left[\sum_{k=1}^{n} a_{ik}x_k\right]_{m \times 1} + \beta\left[\sum_{k=1}^{n} a_{ik}y_k\right]_{m \times 1} = \alpha Ax + \beta Ay.$$


Theorem 14.10. Let $T : \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation. Then there is exactly one matrix $A = [a_{ij}]_{m \times n}$ so that T(x) = Ax for each $x \in \mathbb{R}^n$.


Proof. Define $A_j = T(e_j)$ where $A_j$ is the jth column vector of A. Then by Theorem 14.8, $T(e_j) = Ae_j$ for each 1 ≤ j ≤ n. Since Ax is also linear by Theorem 14.9, for any vector $x = \langle x_1, x_2, x_3, \ldots, x_n \rangle \in \mathbb{R}^n$ we have that $T(x) = x_1T(e_1) + x_2T(e_2) + \cdots + x_nT(e_n) = Ax$. Since we must have $A_j = T(e_j)$ for $Ae_j$ to be equal to $T(e_j)$, the choice of matrix A is unique.
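
The construction in this proof, taking the columns of A to be the images of the standard basis vectors, can be carried out for a concrete map. In the sketch below the linear map `T` from R^3 to R^2 is a hypothetical example chosen for illustration, and vectors are stored as Python lists.

```python
def matrix_of(T, n):
    """Matrix (as a list of rows) whose jth column is T(e_j)."""
    columns = [T([1 if k == j else 0 for k in range(n)]) for j in range(n)]
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def apply(A, x):
    """The matrix-vector product Ax."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]


def T(x):
    """Hypothetical linear map from R^3 to R^2 used only as an example."""
    return [x[0] + 2 * x[1], 3 * x[2] - x[0]]


A = matrix_of(T, 3)
print(A, apply(A, [4, 5, 6]), T([4, 5, 6]))
```

The last two printed vectors agree, as T(x) = Ax requires.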


Theorem 14.11. Let $A = [a_{ij}]_{m \times n}$, $B = [b_{ij}]_{n \times r}$, $C = [c_{ij}]_{r \times t}$. Then A(BC) = (AB)C.


Proof. There are unique linear transformations $T_A, T_B, T_C$ so that $T_C(x) = Cx$ for all $x \in \mathbb{R}^t$, $T_B(x) = Bx$ for all $x \in \mathbb{R}^r$, and $T_A(x) = Ax$ for all $x \in \mathbb{R}^n$. We also know that $T_A(T_B \circ T_C(x)) = (T_A \circ T_B)(T_C(x))$ for all $x \in \mathbb{R}^t$. This means that the matrices corresponding to these function compositions are the same. Since $A(BC)x = T_A(T_B \circ T_C)(x)$ and $(AB)Cx = T_A \circ T_B(T_C(x))$, it follows that A(BC) = (AB)C.


Alternately, by definition $AB = \left[\sum_{k=1}^{n} a_{ik}b_{kj}\right]_{m \times r}$ and $(AB)C = \left[\sum_{s=1}^{r}\left(\sum_{k=1}^{n} a_{ik}b_{ks}\right)c_{sj}\right]_{m \times t}$. On the other hand, $BC = \left[\sum_{s=1}^{r} b_{is}c_{sj}\right]_{n \times t}$, which means that $A(BC) = \left[\sum_{k=1}^{n} a_{ik}\sum_{s=1}^{r} b_{ks}c_{sj}\right]_{m \times t}$. Since these are equal, matrix multiplication is associative as desired.


Definition 127 


We say that an n × n matrix A has an inverse $A^{-1}$ if $AA^{-1} = A^{-1}A = I$. We say that A is invertible if there is an inverse matrix for A.


Theorem 14.12. Let A be an n × n matrix so that for some matrix B it is true that AB = I. Then BA = I and $B = A^{-1}$.


Proof. Since AB = I by Theorem 14.11, we know that for any vector x € R” it is true that 
ABx = Ix = x. This means that the linear transformations Ax and Bx are inverses of 
one another. However, for any functions, the composition of a function with its inverse in 
either order is the identity. Hence, BAx = x, which means that BAx = Ix which implies 
that BA =I by Theorem 14.10 




Definition 128 


Let $E_{(i,j)}$ be the notation for the matrix obtained by switching the ith row and jth row of the identity matrix. This is the elementary matrix associated with the row operation of switching rows i and j.

Let $E^{(\alpha)}_{i}$ be the notation for the matrix obtained by replacing the 1 entry in the ith row of the identity matrix by the non-zero number α. This is the elementary matrix associated with the row operation of multiplying the ith row by α.

Let $E^{(\alpha)}_{(i,j)}$ be the matrix obtained by adding α times the ith row of the identity matrix to the jth row of the identity matrix. This is the elementary matrix associated with the row operation of adding α times the ith row of A to the jth row of A.

Matrices of these three types are called elementary matrices.


Theorem 14.13. (a) Let $A = [a_{ij}]_{n \times n}$ be an n by n matrix and let i, j ∈ {1, 2, 3, ..., n}. Then $E_{(i,j)}A$ is the matrix obtained by switching the ith and jth rows of A, $E^{(\alpha)}_{(i,j)}A$ is the matrix obtained by adding α times the ith row of A to the jth row of A, and $E^{(\alpha)}_{i}A$ is the matrix obtained by multiplying the ith row of A by α.

(b) The matrix obtained by performing any finite sequence of row operations on A is the product of the corresponding elementary matrices outlined in part (a) times A.

(c) The determinant of the product of elementary matrices is the product of the determinants of these matrices, which is always non-zero.

(d) Every product of elementary matrices is invertible.

(e) If the determinant of A is non-zero then A is the product of elementary matrices.

(f) The matrix A is invertible if and only if det(A) ≠ 0.

(g) The equation Ax = 0 has the unique solution x = 0 if and only if the linear transformation T(x) = Ax is both one to one and onto, which is true if and only if det(A) ≠ 0.

(h) If A and B are n by n matrices then det(AB) = det(A) det(B).

(i) A matrix is invertible if and only if it is a product of elementary matrices.


Proof. (a) First, observe that the ith row of $E_{(i,j)}$ has a one in the jth entry and the other entries are zeroes. Thus, the kth entry of the ith row of the product matrix is $e_j \cdot A_k$, where $A_k$ is the kth column of A, which is equal to $a_{jk}$. Similarly, the kth entry of the jth row of the product is $e_i \cdot A_k = a_{ik}$. For any other 1 ≤ m ≤ n the only non-zero entry of the mth row of $E_{(i,j)}$ is 1 in the mth entry, so the kth entry of the mth row is $a_{mk}$.

Next, note that as above, all entries of rows of $E^{(\alpha)}_{i}$ other than the ith row are the same as those of the identity matrix, so all rows of $E^{(\alpha)}_{i}A$ are the same as those of A except for the ith row. In the ith row all entries are zero except for an α in the ith position, so $\alpha e_i \cdot A_k = \alpha a_{ik}$ is the kth entry of the ith row of the product for each 1 ≤ k ≤ n.

Finally, as above, all entries of rows of $E^{(\alpha)}_{(i,j)}$ other than the jth row are the same as those of the identity matrix, so all rows of $E^{(\alpha)}_{(i,j)}A$ are the same as those of A apart from the jth row. In the jth row all entries are zero except for a 1 in the jth position and an α in the ith position. So, by definition of matrix multiplication, the kth entry of the jth row of the product is $(e_j + \alpha e_i) \cdot A_k = a_{jk} + \alpha a_{ik}$.
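
Part (a) can be seen concretely by building the three kinds of elementary matrices and multiplying on the left. The sketch below assumes matrices stored as lists of rows with 0-based row indices; the names `E_swap`, `E_scale` and `E_add` are illustrative stand-ins for the three notations in Definition 128.

```python
def identity(n):
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]


def matmul(A, B):
    """The matrix product AB."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B))) for c in range(len(B[0]))]
            for r in range(len(A))]


def E_swap(n, i, j):
    """Identity matrix with rows i and j interchanged."""
    E = identity(n)
    E[i], E[j] = E[j], E[i]
    return E


def E_scale(n, i, alpha):
    """Identity matrix with the 1 in row i replaced by alpha."""
    E = identity(n)
    E[i][i] = alpha
    return E


def E_add(n, i, j, alpha):
    """Identity matrix with alpha times row i added to row j (i != j)."""
    E = identity(n)
    E[j][i] = alpha
    return E


A = [[1, 2], [3, 4]]
print(matmul(E_swap(2, 0, 1), A))
print(matmul(E_scale(2, 0, 5), A))
print(matmul(E_add(2, 0, 1, 10), A))
```

Multiplying on the left acts on rows; multiplying by the same matrices on the right would act on columns instead.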




(b) It follows from (a) that performing a sequence of row operations results in the same 
matrix as multiplying by the associated row operations on the left sequentially, which is 
the same as multiplying by the matrix which is the product of such elementary matrices by 
Theorem 14.11. 

(c) We know from Theorem 14.5 that switching rows of a matrix negates the determinant,
multiplying a row by $\alpha$ multiplies the determinant by $\alpha$, and adding a multiple of a row to
another row does not change the determinant (it multiplies the determinant by one). The
determinant of $I$ is one, so $\det(E_{(i,j)}I) = \det(E_{(i,j)}) = -1$, $\det(E^{(\alpha)}_{(i,j)}I) = \det(E^{(\alpha)}_{(i,j)}) = 1$,
and $\det(E^{(\alpha)}_{(i)}I) = \det(E^{(\alpha)}_{(i)}) = \alpha \neq 0$ for elementary matrices $E_{(i,j)}$, $E^{(\alpha)}_{(i,j)}$, $E^{(\alpha)}_{(i)}$. Performing
these operations in sequence on $I$ multiplies the resulting determinant by $-1$, $1$ or $\alpha$
respectively at each step, which means that the determinant of the product of
elementary matrices is the product of their determinants. Since the determinant of each
elementary matrix is non-zero, the determinant of the product is also non-zero.

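(d) Each elementary matrix is invertible. In particular $E_{(i,j)}^{-1} = E_{(i,j)}$, the inverse of
$E^{(\alpha)}_{(i)}$ is $E^{(1/\alpha)}_{(i)}$ and the inverse of $E^{(\alpha)}_{(i,j)}$ is $E^{(-\alpha)}_{(i,j)}$. Hence, for any product $B = E_1E_2E_3 \cdots E_n$ of
elementary matrices, the product $E_n^{-1}E_{n-1}^{-1} \cdots E_2^{-1}E_1^{-1}$ is $B^{-1}$.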

(e) Let $\det(A) \neq 0$. Then we know by Theorem 14.6 that we can perform a sequence of
row operations on $A$ to change $A$ to $I$. By part (b) we know that performing this sequence
of row operations is the same as multiplying on the left by the sequence of elementary matrices
associated with these row operations. So, if changing $A$ to $I$ is done by a sequence of
operations whose associated elementary matrices are $E_1, E_2, \ldots, E_k$, then $E_kE_{k-1} \cdots E_2E_1A = I$.
Thus, $A^{-1} = E_kE_{k-1} \cdots E_2E_1$ by Theorem 14.12, which means that $A = E_1^{-1}E_2^{-1} \cdots E_k^{-1}$ is
a product of elementary matrices.

(f) If $\det(A) \neq 0$ then we know that $A$ can be converted to $I$ by performing a finite
sequence of row operations by Theorem 14.6. Hence, if these row operations are associated
with multiplying by elementary matrices $E_1, E_2, \ldots, E_m$, then
$E_mE_{m-1} \cdots E_1A = I$, so $E_mE_{m-1} \cdots E_1 = A^{-1}$. If $A$ is invertible then if $Ax = 0$ it must
follow that $A^{-1}Ax = A^{-1}0 = 0$, so $x = 0$. Since the solution to this is unique, by Theorem
14.6, again, we know that $\det(A) \neq 0$.

(g) If $Ax = 0$ has only the solution $x = 0$ then if $Ax = Ay$ for some $x, y$ we know
that $A(x - y) = 0$ which means that $x - y = 0$ so $x = y$, so the linear transformation
$T(x) = Ax$ is one to one. Since the solution is unique, we also know that the determinant
of $A$ is non-zero which implies that for any vector $c \in \mathbb{R}^n$ there is a unique $x$ so that $Ax = c$
by Theorem 14.6. This means that $T$ is both one to one and onto. This is also equivalent
to $\det(A) \neq 0$ by Theorem 14.6 and part (b).

(h) In (c), (e) we have established that the determinant of a matrix is non-zero if and
only if it is the product of elementary matrices. Hence, if $\det(A) \neq 0$ and $\det(B) \neq 0$ then we
know that $A = A_1A_2 \cdots A_k$ for some elementary matrices $A_1, A_2, \ldots, A_k$ and $B = B_1B_2 \cdots B_j$
for elementary matrices $B_1, B_2, \ldots, B_j$. By (c),
$\det(AB) = \det(A_1A_2 \cdots A_kB_1B_2 \cdots B_j) = (\det(A_1)\det(A_2) \cdots \det(A_k))(\det(B_1)\det(B_2) \cdots \det(B_j)) = \det(A)\det(B)$.
If $\det(B) = 0$ then we know that $Bx = 0$ has infinitely many solutions by
Theorem 14.6 (since we know that $x = 0$ is a solution), so it has a non-zero solution $p$.
Hence, $ABp = A0 = 0$ which means that $ABx = 0$ has a non-zero solution, so $\det(AB) = 0$.
If $\det(B) \neq 0$ and $\det(A) = 0$ then there is some $q \neq 0$ so that $Aq = 0$, and since $B$ is
invertible it follows that $ABB^{-1}q = Aq = 0$. Since $B^{-1}q \neq 0$, the equation $ABx = 0$ has a
non-zero solution. Thus, $\det(AB) = 0$.

(i) This follows from (d), (e), (c), and (f). 
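
Parts (a) and (c) can be checked on a small example. The following is a minimal sketch, not part of the text's development: the helper names (`swap_matrix`, `scale_matrix`, `add_matrix`, `matmul`, `det`) and the sample matrix `A` are our own illustrative choices, indices are 0-based, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]

def swap_matrix(n, i, j):
    """Identity matrix with rows i and j switched."""
    e = identity(n)
    e[i], e[j] = e[j], e[i]
    return e

def scale_matrix(n, i, alpha):
    """Identity matrix with row i multiplied by alpha."""
    e = identity(n)
    e[i][i] = Fraction(alpha)
    return e

def add_matrix(n, i, j, alpha):
    """Identity matrix with alpha times row j added to row i (i != j assumed)."""
    e = identity(n)
    e[i][j] += Fraction(alpha)
    return e

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = Fraction(0)
    for c in range(len(m)):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * m[0][c] * det(minor)
    return total

# Hypothetical sample matrix used only for illustration.
A = [[Fraction(v) for v in row] for row in [[2, 1, 0], [1, 3, 4], [0, 5, 6]]]

swapped = matmul(swap_matrix(3, 0, 2), A)    # rows 0 and 2 of A switched
scaled = matmul(scale_matrix(3, 1, 5), A)    # row 1 of A multiplied by 5
added = matmul(add_matrix(3, 0, 1, -2), A)   # -2 times row 1 added to row 0
```

Multiplying on the left by each elementary matrix performs the corresponding row operation on `A`, and the determinants of the three products are $-\det(A)$, $5\det(A)$ and $\det(A)$ respectively, as parts (a) and (c) predict.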




14.2 Partitions of Unity 


Partitions of unity were not needed for any of the results in this text, but we include a 
development of them here because they are useful in Advanced Calculus for extending local 
properties to global ones. The format of this development largely parallels that of Spivak (1).


Theorem 14.14. Let $f(x) = e^{-1/x^2}$ if $x \neq 0$ and let $f(0) = 0$. Then $f$ has derivatives of
all orders, but is not analytic on any open interval containing 0. Furthermore, the functions
$g(x) = e^{-1/x^2}$ if $x > 0$ and $g(x) = 0$ if $x \leq 0$, and $h(x) = e^{-1/x^2}$ if $x < 0$ and $h(x) = 0$ if $x \geq 0$,
are $C^\infty$, and $f^{(n)}(0) = g^{(n)}(0) = h^{(n)}(0) = 0$ for every natural number $n$.


Proof. Each derivative $f^{(n)}(x)$ is a rational function multiplied by $e^{-1/x^2}$ everywhere except
at $x = 0$. To see this, note that this is true for $n = 0$, and if the $k$th derivative is $\frac{p(x)}{q(x)}e^{-1/x^2}$ then

$$f^{(k+1)}(x) = \frac{p'(x)q(x) - p(x)q'(x)}{(q(x))^2}e^{-1/x^2} + \frac{p(x)}{q(x)} \cdot \frac{2}{x^3}e^{-1/x^2},$$

so the statement follows inductively.

At $x = 0$, let $f^{(n)}(x) = \frac{p(x)}{q(x)}e^{-1/x^2}$ for $x \neq 0$. Then, assuming inductively that $f^{(n)}(0) = 0$, we have

$$f^{(n+1)}(0) = \lim_{x \to 0} \frac{f^{(n)}(x) - 0}{x - 0} = \lim_{x \to 0} \frac{p(x)}{xq(x)}e^{-1/x^2},$$

which is the limit of a rational function times $e^{-1/x^2}$. For any positive integer $m$ it is true
that $\lim_{x \to 0^+} \frac{1}{x^m}e^{-1/x^2} = \lim_{u \to \infty} u^m e^{-u^2}$ by Theorem 4.14. By L'Hospital's rule we know that
$\lim_{u \to \infty} \frac{u^m}{e^u} = 0$ since differentiating the numerator $m$ times leaves a derivative of $m!$ and
differentiating the denominator still leaves $e^u$ and $\lim_{u \to \infty} \frac{m!}{e^u} = 0$. Since $0 \leq u^m e^{-u^2} \leq \frac{u^m}{e^u}$ for all
$u \geq 1$ it follows that $\lim_{u \to \infty} u^m e^{-u^2} = 0$ by the Squeeze Theorem for extended real numbers.
Similarly, it follows that $\lim_{x \to 0^-} \frac{1}{x^m}e^{-1/x^2} = 0$.

If we choose $m$ larger than the degree of $xq(x)$ then $\lim_{x \to 0} \frac{x^m}{xq(x)} = 0$ since if $k$ is the power
of the lowest power summand $Cx^k$ in $xq(x)$ then dividing the numerator and denominator by
$x^k$ gives us a numerator $x^{m-k}$ which approaches zero, and a denominator which approaches
$C$ as $x$ approaches zero. Since $0 \leq \left|\frac{1}{xq(x)}\right| e^{-1/x^2} \leq \frac{1}{|x|^m}e^{-1/x^2}$ for $x$ sufficiently close to zero, it
follows from the Squeeze Theorem that $\lim_{x \to 0} \frac{1}{xq(x)}e^{-1/x^2} = 0$. We also know that $\lim_{x \to 0} p(x) = p(0)$.
Thus, by the product rule for limits $f^{(n+1)}(0) = \lim_{x \to 0} \frac{p(x)}{xq(x)}e^{-1/x^2} = p(0)(0) = 0$.

Thus, the derivatives of all orders of $f$ are zero at $x = 0$, so the Maclaurin series for
$f$ is identically zero and agrees with $f$ only at the single point $x = 0$. Hence $f$ is not analytic
on any open interval containing 0.

To see that the function $g$ is $C^\infty$ we first note that derivatives of all orders are zero at
points $x < 0$ and derivatives of all orders exist at points $x > 0$ since $g(x) = f(x)$ on $(0, \infty)$.
At 0 we observe that for each natural number $n$ it is true that
$\lim_{x \to 0^-} \frac{g^{(n)}(x) - g^{(n)}(0)}{x - 0} = \lim_{x \to 0^-} \frac{0 - 0}{x} = 0$ and
$\lim_{x \to 0^+} \frac{g^{(n)}(x) - g^{(n)}(0)}{x - 0} = \lim_{x \to 0^+} \frac{f^{(n)}(x) - 0}{x - 0} = f^{(n+1)}(0) = 0$. Since $h(x) = g(-x)$, it also follows
that $h$ is $C^\infty$ and $h^{(n)}(0) = 0$ for every natural number $n$ by the chain rule.
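
The key estimate in this proof is that $e^{-1/x^2}$ tends to zero faster than any power of $x$. The following is a small numerical illustration only, not a proof; the function names `f` and `scaled` and the sample exponent $m = 10$ are our own choices.

```python
import math

def f(x):
    """f(x) = exp(-1/x^2) for x != 0 and f(0) = 0."""
    square = x * x
    if square == 0.0:        # x is zero (or so small that x*x underflows)
        return 0.0
    return math.exp(-1.0 / square)

def scaled(x, m):
    """The quotient f(x) / x^m, which the proof shows tends to 0 as x -> 0."""
    return f(x) / x ** m

# Quotients for m = 10 at x = 0.5, 0.2, 0.1: they shrink rapidly even though
# the factor 1 / x^10 alone grows without bound.
values = [scaled(x, 10) for x in (0.5, 0.2, 0.1)]
```

Even with the exponent $m = 10$, the quotient drops from roughly 19 at $x = 0.5$ to below $10^{-30}$ at $x = 0.1$.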




Definition 129 


The support of a function $f : \mathbb{R}^n \to \mathbb{R}$ is $\operatorname{spt}(f) = \overline{f^{-1}(\mathbb{R} \setminus \{0\})}$.


We note that since the support of a function is closed, it is compact if and only if it is 
bounded. 


We next demonstrate that there is a $C^\infty$ function that is positive on a given open interval
and zero outside of that interval.


Theorem 14.15. For any interval $(a, b)$, there is a function $w_{(a,b)} : \mathbb{R} \to \mathbb{R}$ which is $C^\infty$
so that $w_{(a,b)}(x) > 0$ for all $x \in (a, b)$ and $w_{(a,b)}(x) = 0$ otherwise.


Proof. We set $w_{(a,b)}(x) = e^{-1/(x-a)^2}e^{-1/(x-b)^2}$ if $x \in (a, b)$ and $w_{(a,b)}(x) = 0$ otherwise. Since this
is just the product $g(x - a)h(x - b)$ from Theorem 14.14, we know that $w_{(a,b)}$ is $C^\infty$ (by the
product rule).
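
The construction in this proof is easy to evaluate directly. The following is a minimal sketch, assuming the functions $g$ and $h$ of Theorem 14.14; the Python names `g`, `h` and `w` simply mirror the text's notation, and the guard against floating-point underflow is our own addition.

```python
import math

def g(x):
    """exp(-1/x^2) for x > 0 and 0 for x <= 0."""
    square = x * x
    if x <= 0 or square == 0.0:   # second test guards against underflow of x*x
        return 0.0
    return math.exp(-1.0 / square)

def h(x):
    """exp(-1/x^2) for x < 0 and 0 for x >= 0; note h(x) = g(-x)."""
    return g(-x)

def w(a, b, x):
    """The bump g(x - a) * h(x - b): positive on (a, b) and zero elsewhere."""
    return g(x - a) * h(x - b)
```

For example, `w(0, 1, 0.5)` equals $e^{-4}e^{-4} = e^{-8}$, while `w(0, 1, x)` is zero at both end points and at every point outside $(0, 1)$.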


We next observe that we can find a $C^\infty$ function for a given open interval $(a, b)$ that
is zero up to the left end point of the interval, one after the right end point of the
interval, and positive in between.


Theorem 14.16. Let $(a, b)$ be an open interval and let $w_{(a,b)} : \mathbb{R} \to \mathbb{R}$ be a $C^\infty$ function
so that $w_{(a,b)}(x) > 0$ if $x \in (a, b)$ and $w_{(a,b)}(x) = 0$ otherwise. Then there is a $C^\infty$ function
$h_{(a,b)}$ so that $h_{(a,b)}(x) = 0$ if $x \leq a$ and $h_{(a,b)}(x) = 1$ if $x \geq b$ and $h_{(a,b)}(x) > 0$ if $a < x < b$.


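Proof. We define

$$h_{(a,b)}(x) = \frac{\int_a^x w_{(a,b)}(t)\,dt}{\int_a^b w_{(a,b)}(t)\,dt}.$$

By the Fundamental Theorem of Calculus we see that $h_{(a,b)}'(x) = \frac{w_{(a,b)}(x)}{\int_a^b w_{(a,b)}(t)\,dt}$ is $C^\infty$, so $h_{(a,b)}$ is $C^\infty$.
If $x \leq a$ then since $w_{(a,b)}(t) = 0$ for all $t \in [x, a]$, we know that $\int_a^x w_{(a,b)}(t)\,dt = 0$, so $h_{(a,b)}(x) = 0$ as well.
If $x \in (a, b)$ then since $w_{(a,b)}$ is continuous and positive we know that $h_{(a,b)}(x) > 0$ by Exercise 6.5. If
$x \geq b$ then

$$h_{(a,b)}(x) = \frac{\int_a^x w_{(a,b)}(t)\,dt}{\int_a^b w_{(a,b)}(t)\,dt} = \frac{\int_a^b w_{(a,b)}(t)\,dt + \int_b^x w_{(a,b)}(t)\,dt}{\int_a^b w_{(a,b)}(t)\,dt} = 1 + 0 = 1$$

since $w_{(a,b)}(t) = 0$ for all $t \in [b, x]$.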
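
The smooth step function of Theorem 14.16 can be approximated numerically. The sketch below is an illustration under stated assumptions: it uses the bump $w_{(a,b)}$ of Theorem 14.15, replaces the exact integrals by a midpoint-rule approximation (our choice, with a hypothetical default of 2000 subintervals), and the names `w`, `integral` and `smooth_step` are ours.

```python
import math

def w(a, b, x):
    """Bump of Theorem 14.15: positive on (a, b), zero elsewhere."""
    if not a < x < b:
        return 0.0
    left, right = (x - a) ** 2, (x - b) ** 2
    if left == 0.0 or right == 0.0:   # squares underflowed; value rounds to 0
        return 0.0
    return math.exp(-1.0 / left - 1.0 / right)

def integral(a, b, lo, hi, steps=2000):
    """Midpoint-rule approximation of the integral of w_(a,b) over [lo, hi]."""
    if hi <= lo:
        return 0.0
    width = (hi - lo) / steps
    return width * sum(w(a, b, lo + (k + 0.5) * width) for k in range(steps))

def smooth_step(a, b, x):
    """Approximate h_(a,b)(x): 0 for x <= a, 1 for x >= b, increasing between."""
    if x <= a:
        return 0.0        # the integral from a to x vanishes
    if x >= b:
        return 1.0        # w vanishes on [b, x], so the quotient is exactly 1
    return integral(a, b, a, x) / integral(a, b, a, b)
```

Since $w_{(0,1)}$ is symmetric about $x = 1/2$, the value `smooth_step(0, 1, 0.5)` should be close to $1/2$, which gives a quick sanity check on the approximation.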


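The next step is to move into $\mathbb{R}^n$ by showing that we can find a function which is $C^\infty$
on a cube centered at a point which is one near the point, positive a little further from the
point and then zero outside of a small cube about the point. Since the notion of an $\epsilon$ cube
is useful in more than one context, we will just define a notation for this idea first. Throughout,
$R_\epsilon(p)$ denotes the closed cube $\prod_{i=1}^n [p_i - \epsilon, p_i + \epsilon]$ and $R_\epsilon(p)^\circ$ denotes its interior.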




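Theorem 14.17. Let $p = (p_1, p_2, p_3, \ldots, p_n) \in \mathbb{R}^n$, and let $\epsilon > 0$. Then there is a $C^\infty$
function $g_{(p,\epsilon)} : \mathbb{R}^n \to \mathbb{R}$ so that $g_{(p,\epsilon)}(x) > 0$ if $x \in R_\epsilon(p)^\circ$ and $g_{(p,\epsilon)}(x) = 0$ otherwise, and
$\operatorname{spt}(g_{(p,\epsilon)}) = R_\epsilon(p)$.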


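Proof. Let $w_{(a,b)}$ be as described in Theorem 14.15. For any $x = (x_1, x_2, x_3, \ldots, x_n)$, we
define $g_{(p,\epsilon)}(x) = \prod_{i=1}^n w_{(p_i - \epsilon, p_i + \epsilon)}(x_i)$. Then $g_{(p,\epsilon)}$ is $C^\infty$ since each $w_{(p_i - \epsilon, p_i + \epsilon)}$ is $C^\infty$. If
$x \in R_\epsilon(p)^\circ$ then $p_i - \epsilon < x_i < p_i + \epsilon$ for each $i$, which means that $g_{(p,\epsilon)}(x) > 0$. If $x \notin R_\epsilon(p)^\circ$
then for some $i$ it must be true that $x_i \notin (p_i - \epsilon, p_i + \epsilon)$ which means that $w_{(p_i - \epsilon, p_i + \epsilon)}(x_i) = 0$,
so $g_{(p,\epsilon)}(x) = 0$. Since $\overline{R_\epsilon(p)^\circ} = R_\epsilon(p)$ this means that $\operatorname{spt}(g_{(p,\epsilon)}) = R_\epsilon(p)$.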
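
The product construction in this proof can be sketched as follows. This is an illustration only: points of $\mathbb{R}^n$ are represented as Python tuples, and the names `w` and `cube_bump` are our own.

```python
import math

def w(a, b, x):
    """One-variable bump of Theorem 14.15: positive on (a, b), zero elsewhere."""
    if not a < x < b:
        return 0.0
    left, right = (x - a) ** 2, (x - b) ** 2
    if left == 0.0 or right == 0.0:   # squares underflowed; value rounds to 0
        return 0.0
    return math.exp(-1.0 / left - 1.0 / right)

def cube_bump(p, eps, x):
    """Product over the coordinates of the bumps w_(p_i - eps, p_i + eps)(x_i)."""
    value = 1.0
    for p_i, x_i in zip(p, x):
        value *= w(p_i - eps, p_i + eps, x_i)
    return value
```

With $p = (0, 0)$ and $\epsilon = 1$, the value at the center is $e^{-2}e^{-2} = e^{-4}$, and the function vanishes as soon as a single coordinate leaves the open interval $(-1, 1)$.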


In the field of topology, there is a theorem called Urysohn's Lemma which shows that
for a normal topological space and any two disjoint closed sets in that space, there is a
continuous function from the space to the interval $[0, 1]$ which takes one closed set to zero
and the other to one. The following theorem can be thought of as an analog to this lemma
in $\mathbb{R}^n$ for functions which are $C^\infty$ where one of the two closed sets is compact.


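Theorem 14.18. Let $K$ be a compact subset of $\mathbb{R}^n$ and let $U$ be an open set containing
$K$. Then there is a $C^\infty$ function $f_{(K,U)} : \mathbb{R}^n \to \mathbb{R}$ and an open $V$ so that $K \subset V \subset \overline{V} \subset U$
and $f_{(K,U)}(x) = 1$ for all $x \in K$, $0 < f_{(K,U)}(x) \leq 1$ if $x \in V$ and $f_{(K,U)}(x) = 0$ if $x \notin V$, so
$\operatorname{spt}(f_{(K,U)}) = \overline{V} \subset U$.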


Proof. Let $g_{(x,\epsilon)}$ and $h_{(a,b)}$ be as defined in Theorems 14.17 and 14.16.

By the Lebesgue Number Lemma we can find an $\epsilon > 0$ so that if the diameter of a set
$S$ is less than or equal to $2\epsilon$ and $S$ intersects $K$ then $S \subset U$.

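Let $C = \{R_\epsilon(x)^\circ\}_{x \in K}$. Since $K$ is compact there is a finite $F \subset C$ which covers $K$, where
$F = \{R_\epsilon(x_i)^\circ\}_{1 \leq i \leq m}$. Define $f(x) = \sum_{i=1}^m g_{(x_i, 2\epsilon)}(x)$ for all $x \in \mathbb{R}^n$. Set $V = \bigcup_{i=1}^m R_{2\epsilon}(x_i)^\circ$.
Then $K \subset \bigcup_{i=1}^m R_\epsilon(x_i)^\circ \subset V \subset \overline{V} \subset U$.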

If $x \in V$ then for some $i$ it follows that $x \in R_{2\epsilon}(x_i)^\circ$ which means that $f(x) \geq
g_{(x_i, 2\epsilon)}(x) > 0$ since $x \in R_{2\epsilon}(x_i)^\circ$. If $x \notin V$ then $x \notin R_{2\epsilon}(x_i)^\circ$ for any natural number
$i \leq m$, which means that $g_{(x_i, 2\epsilon)}(x) = 0$ for each $i$, so $f(x) = 0$. By the Extreme Value
Theorem, since $K$ is compact and $f$ is positive on $K$, we know that $f$ takes on a minimum
value $r > 0$ on $K$.

We define $f_{(K,U)}(x) = h_{(0,r)}(f(x))$. Then for all $x \in K$ we know that $f(x) \geq r$, so
$f_{(K,U)}(x) = h_{(0,r)}(f(x)) = 1$, and for all $x \notin V$, we know that $f(x) = 0$, so $f_{(K,U)}(x) =
h_{(0,r)}(f(x)) = 0$. If $x \in V$ then $f(x) > 0$ so $f_{(K,U)}(x) = h_{(0,r)}(f(x)) > 0$. Hence,
$\operatorname{spt}(f_{(K,U)}) = \overline{V} \subset U$.


To be more specific, $\operatorname{spt}(f_{(K,U)}) = \bigcup_{i=1}^m R_{2\epsilon}(x_i)$ in the previous argument, but equality
was not needed for the result in question.
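
The construction of $f_{(K,U)}$ can be traced in one dimension. The sketch below takes $K = [0, 1]$ and $U = (-2, 3)$ and rests on several assumptions of our own: the centers and $\epsilon = 0.5$ are hand-picked rather than produced by the Lebesgue Number Lemma, the minimum $r$ is estimated on a grid and halved for safety, integrals use the midpoint rule, and the step $h_{(0,r)}(t)$ is replaced by the rescaled step $h_{(0,1)}(t/r)$, which has the same properties but avoids floating-point underflow of $w_{(0,r)}$ for small $r$.

```python
import math

def w(a, b, x):
    """Bump of Theorem 14.15: positive on (a, b), zero elsewhere."""
    if not a < x < b:
        return 0.0
    left, right = (x - a) ** 2, (x - b) ** 2
    if left == 0.0 or right == 0.0:   # squares underflowed; value rounds to 0
        return 0.0
    return math.exp(-1.0 / left - 1.0 / right)

def midpoint_integral(func, lo, hi, steps=2000):
    width = (hi - lo) / steps
    return width * sum(func(lo + (k + 0.5) * width) for k in range(steps))

TOTAL = midpoint_integral(lambda t: w(0.0, 1.0, t), 0.0, 1.0)

def unit_step(t):
    """Approximate h_(0,1) of Theorem 14.16: 0 for t <= 0, 1 for t >= 1."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    return midpoint_integral(lambda s: w(0.0, 1.0, s), 0.0, t) / TOTAL

CENTERS = (0.0, 0.5, 1.0)   # hypothetical centers x_i of a finite cover of K
EPS = 0.5                   # hypothetical epsilon; V = (-1, 2) lies inside U

def f(x):
    """Sum of the bumps g_(x_i, 2*eps); positive exactly on V = (-1, 2)."""
    return sum(w(c - 2 * EPS, c + 2 * EPS, x) for c in CENTERS)

# Half of the grid minimum of f on K, so that f >= R on all of K with margin.
R = 0.5 * min(f(k / 100.0) for k in range(101))

def urysohn(x):
    """One on K = [0, 1], positive on V = (-1, 2), zero outside V."""
    return unit_step(f(x) / R)
```

Here `urysohn` returns 1 on $[0, 1]$, a value strictly between 0 and 1 at points such as $x = -0.5$ and $x = 1.5$, and 0 outside $(-1, 2)$.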




It is helpful for certain theorems to have the notion of an $F_\sigma$ set and a $G_\delta$ set.


Definition 130 


The intersection of countably many open sets is a $G_\delta$ set. The union of countably
many closed sets is an $F_\sigma$ set. If a set $S$ is the union of countably many compact sets
then the set is referred to as being $\sigma$-compact.


Now, in $\mathbb{R}^n$ there isn't any difference between the set of sets that are $F_\sigma$ and the set of
sets that are $\sigma$-compact.


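Theorem 14.19. Every closed set is a $G_\delta$ set in $\mathbb{R}^n$. For each open set $U$ there are compact
sets $\{K_i\}_{i \in \mathbb{N}}$ so that $K_1 \subset K_2^\circ \subset K_2 \subset K_3^\circ \subset K_3 \subset K_4^\circ \ldots$ and so that $U = \bigcup_{i=1}^\infty K_i$.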


Proof. First, let $A$ be a closed set. Let $V_i = \bigcup_{x \in A} B_{1/i}(x)$. Note that $V_i \subset V_j$ if $i > j$ and
$A \subset \bigcap_{i=1}^\infty V_i$, since every point of $A$ is contained in an open ball contained in $V_i$ for each $i \in \mathbb{N}$.
Let $p \notin A$. Then since $A$ is closed we know that $B_{1/m}(p) \cap A = \emptyset$ for some $m \in \mathbb{N}$. Hence, there is no point
of $A$ within distance $\frac{1}{m}$ of $p$ and therefore $p \notin V_m$. Thus, $A = \bigcap_{i=1}^\infty V_i$, so every closed set is
a $G_\delta$ set.
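Let $U$ be an open set. Let $V_i = \bigcup_{x \in \mathbb{R}^n \setminus U} B_{1/i}(x)$. By the preceding argument we know
that $\bigcap_{i=1}^\infty V_i = \mathbb{R}^n \setminus U$, so by DeMorgan's Laws we know that $\bigcup_{i=1}^\infty (\mathbb{R}^n \setminus V_i) = U$. Let
$K_m = \overline{B_m(0)} \cap \bigcup_{i=1}^m (\mathbb{R}^n \setminus V_i) = \overline{B_m(0)} \cap (\mathbb{R}^n \setminus V_m)$ for each $m \in \mathbb{N}$. Since each $K_m$ is an intersection
of closed sets and is therefore closed (and bounded since $K_m \subset \overline{B_m(0)}$), it follows from the
Heine-Borel Theorem that $K_m$ is compact.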


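For every $x \in U$ there is some $k$ so that $x \in B_k(0)$ and $x \in \mathbb{R}^n \setminus V_k$, so $\bigcup_{m=1}^\infty K_m = U$.

Let $p \in K_m$ for some $m$. Let $0 < \delta < \frac{1}{m} - \frac{1}{m+1}$ and let $x \in B_\delta(p)$. Then $|x - 0| \leq
|x - p| + |p - 0| < 1 + m$, which means that $x \in B_{m+1}(0)$. Also, if $y \in \mathbb{R}^n \setminus U$ then
$|y - x| \geq |y - p| - |p - x| \geq \frac{1}{m} - \delta > \frac{1}{m+1}$, which means that $x \in \mathbb{R}^n \setminus V_{m+1}$. Hence,
$B_\delta(p) \subset K_{m+1}$ and $p \in K_{m+1}^\circ$. It follows that $K_1 \subset K_2^\circ \subset K_2 \subset K_3^\circ \subset K_3 \subset K_4^\circ \ldots$.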


An immediate consequence of this theorem is that open sets are $\sigma$-compact (and thus
$F_\sigma$ sets).
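
For a concrete instance of the sets $K_m$ in this proof, take $U = (0, 1)$ in $\mathbb{R}$. Then $V_m$ is the set of points within distance $\frac{1}{m}$ of $\mathbb{R} \setminus U$, so $\mathbb{R} \setminus V_m = [\frac{1}{m}, 1 - \frac{1}{m}]$ and $K_m = [-m, m] \cap [\frac{1}{m}, 1 - \frac{1}{m}]$. The sketch below computes these intervals exactly; the function names are our own, and an empty $K_m$ is represented by `None`.

```python
from fractions import Fraction

def exhaustion_interval(m):
    """End points of K_m for U = (0, 1), or None when K_m is empty."""
    lo, hi = Fraction(1, m), 1 - Fraction(1, m)
    # Intersect with the closed ball of radius m about 0.
    lo, hi = max(lo, Fraction(-m)), min(hi, Fraction(m))
    return (lo, hi) if lo <= hi else None

def inside_interior(inner, outer):
    """True when the closed interval `inner` lies in the interior of `outer`."""
    if inner is None:
        return True
    return outer is not None and outer[0] < inner[0] and inner[1] < outer[1]
```

Here $K_1$ is empty, $K_2 = \{\frac{1}{2}\}$, $K_3 = [\frac{1}{3}, \frac{2}{3}]$, and each $K_m$ lies in the interior of $K_{m+1}$, in agreement with the theorem.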




We now define a useful idea for breaking a function down to a sum of simpler functions 
called a partition of unity. This will help us with the change of variables theorem. In 
general, it is useful for purposes of proofs extending local properties of functions to global 
properties. 


Definition 131 


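Let $S \subset \mathbb{R}^n$. Let $U$ be an open set containing $S$. Then a set $\Phi$ of $C^\infty$ functions
$\phi_\beta : U \to \mathbb{R}$, for $\beta$ in some indexing set $\Psi$, is a $C^\infty$ partition of unity for $S$ if $\Phi$
satisfies the following conditions:

(1) If $x \in S$ and $\phi_\beta \in \Phi$ then $\phi_\beta(x) \in [0, 1]$.
(2) For each $p \in S$ there is an $\epsilon_p > 0$ so that $F_{(\Phi, \epsilon_p)} = \{\phi_\beta \in \Phi \mid \operatorname{spt}(\phi_\beta) \cap B_{\epsilon_p}(p) \neq \emptyset\}$
is finite.
(3) For each $x \in S$, the sum $\sum_{\beta \in \Psi} \phi_\beta(x) = \sum_{\phi_\beta \in F_{(\Phi, \epsilon_x)}} \phi_\beta(x) = 1$.

If we replace $\infty$ by a positive integer $p$ in the preceding definition then $\Phi$ is called a
$C^p$ partition of unity for $S$.
Let $C = \{U_\alpha\}_{\alpha \in J}$ be an open cover of $S$. If $\Phi$ also satisfies the following condition
then $\Phi$ is a partition of unity subordinate to $C$.
(4) For each $\phi_\beta \in \Phi$ there is some $U_\alpha \in C$ so that $\operatorname{spt}(\phi_\beta) \subset U_\alpha$.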


A partition of unity for a set subordinate to a cover of that set is also a partition of unity
for any of its subsets subordinate to the same cover. We go through this observation below.


Theorem 14.20. Let $S \subset \mathbb{R}^n$, let $R \subset S$ and let $\Phi = \{\phi_\beta\}_{\beta \in \Psi}$ be a partition of unity for
$S$ subordinate to the cover $C = \{U_\alpha\}_{\alpha \in J}$, where each $\phi_\beta : U \to \mathbb{R}$ and $U$ is open in $\mathbb{R}^n$. Let
$D$ be an open cover of $R$ so that each element of $C$ is contained in an element of $D$. Then
$\Phi$ is a partition of unity for $R$ subordinate to $D$.


Proof. We check each condition. 

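(1) If $x \in R$ then $x \in S$, so for every $\phi_\beta \in \Phi$ we know $\phi_\beta(x) \in [0, 1]$.

(2) If $p \in R$ then $p \in S$, so there is an $\epsilon_p > 0$ so that $F_{(\Phi, \epsilon_p)} = \{\phi_\beta \in \Phi \mid \operatorname{spt}(\phi_\beta) \cap
B_{\epsilon_p}(p) \neq \emptyset\}$ is finite.

(3) For each $x \in R$, we know $x \in S$, so the sum $\sum_{\beta \in \Psi} \phi_\beta(x) = \sum_{\phi_\beta \in F_{(\Phi, \epsilon_x)}} \phi_\beta(x) = 1$.

(4) Since $\Phi$ is subordinate to $C$ we know that for each $\phi_\beta \in \Phi$ there is some $U_\alpha \in C$ so
that $\operatorname{spt}(\phi_\beta) \subset U_\alpha \subset V$ for some $V \in D$.

Hence, we know that $\Phi$ is a partition of unity for $R$ subordinate to $D$.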


We next prove that we can find a partition of unity subordinate to any cover of any set. 


Theorem 14.21. Let $C = \{U_\alpha\}_{\alpha \in J}$ be an open cover of $S \subset \mathbb{R}^n$. Then there is a countable
(finite if $S$ is compact) $C^\infty$ partition of unity $\Phi = \{\phi_\beta\}_{\beta \in \Psi}$ for $S$ which is subordinate to
$C$.




Proof. We first show that we can find a finite partition of unity subordinate to C' assuming 
that S is compact. 

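By the Lebesgue Number Lemma we can find $\epsilon > 0$ so that if a set $Q$ of diameter less
than or equal to $4\epsilon$ intersects $S$ then $Q \subset U_\alpha$ for some $U_\alpha \in C$ and also $Q \subset U$. Let $D =
\{B_\epsilon(x)\}_{x \in S}$. Since $S$ is compact, $D$ has a finite subset $E = \{B_\epsilon(x_i)\}_{1 \leq i \leq m}$ which covers $S$.
By Theorem 14.18, for each $i \in \{1, 2, 3, \ldots, m\}$ we can find an open set $V_i$ and a $C^\infty$ function
$f_{(\overline{B_\epsilon(x_i)}, B_{2\epsilon}(x_i))} : \mathbb{R}^n \to [0, 1]$ so that $\overline{B_\epsilon(x_i)} \subset V_i \subset \overline{V_i} = \operatorname{spt}(f_{(\overline{B_\epsilon(x_i)}, B_{2\epsilon}(x_i))}) \subset B_{2\epsilon}(x_i)$,
where $f_{(\overline{B_\epsilon(x_i)}, B_{2\epsilon}(x_i))}(x) = 1$ if $x \in \overline{B_\epsilon(x_i)}$, $f_{(\overline{B_\epsilon(x_i)}, B_{2\epsilon}(x_i))}(x) > 0$ if and only if $x \in V_i$ and
$B_{2\epsilon}(x_i) \subset U_{\alpha_i}$ for some $\alpha_i \in J$. Setting $V = \bigcup_{i=1}^m V_i$, it follows that $S \subset V \subset \overline{V} \subset U$.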
Let $W = \bigcup_{i=1}^m B_\epsilon(x_i)$. By Theorem 14.18, we can find a $C^\infty$ function $f_{(S,W)} : \mathbb{R}^n \to [0, 1]$
so that $f_{(S,W)}(x) = 1$ if $x \in S$ and $\operatorname{spt}(f_{(S,W)}) \subset W$.
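For each natural number $i \leq m$, we define a function $\phi_i : \mathbb{R}^n \to [0, 1]$ by

$$\phi_i(x) = \frac{f_{(S,W)}(x)\, f_{(\overline{B_\epsilon(x_i)}, B_{2\epsilon}(x_i))}(x)}{\sum_{j=1}^m f_{(\overline{B_\epsilon(x_j)}, B_{2\epsilon}(x_j))}(x)}$$

(interpreted as 0 where $f_{(S,W)}(x) = 0$). The denominator $\sum_{j=1}^m f_{(\overline{B_\epsilon(x_j)}, B_{2\epsilon}(x_j))}(x)$ is non-zero on $V$ and
$W \subset V$, so each $\phi_i$ is $C^\infty$ on $W$. If $x \notin W$ then $x \notin \operatorname{spt}(f_{(S,W)})$, so we can find some $\delta > 0$
so that $B_\delta(x) \cap \operatorname{spt}(f_{(S,W)}) = \emptyset$ (since $\operatorname{spt}(f_{(S,W)})$ is closed). Since $f_{(S,W)}$ is zero on $B_\delta(x)$
we know that $\phi_i$ is also zero on $B_\delta(x)$, which means that $\phi_i$ is $C^\infty$ at $x$. Hence $\phi_i$ is $C^\infty$
on $\mathbb{R}^n$ as desired.

If $x \in S$ then $f_{(S,W)}(x) = 1$, so

$$\sum_{i=1}^m \phi_i(x) = \frac{\sum_{i=1}^m f_{(\overline{B_\epsilon(x_i)}, B_{2\epsilon}(x_i))}(x)}{\sum_{j=1}^m f_{(\overline{B_\epsilon(x_j)}, B_{2\epsilon}(x_j))}(x)} = 1.$$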


Next, we extend this to an arbitrary set $S$. Let $D = \bigcup C$. By Theorem 14.19, we
can find a sequence of compact sets $K_1 \subset K_2^\circ \subset K_2 \subset K_3^\circ \ldots$ so that $\bigcup_{i=1}^\infty K_i = D$. By the
argument above, we can find finite partitions of unity $\Phi_1$ for $K_1$ and $\Phi_2$ for $K_2$ subordinate
to $C$, and $\Phi_i$ for each natural number $i \geq 3$, so that $\Phi_i$ is a partition of unity for $K_i \setminus K_{i-1}^\circ$
which is subordinate to the open cover $C_i = \{U_\alpha \cap (K_{i+1}^\circ \setminus K_{i-2}) \mid U_\alpha \in C\}$ of $K_i \setminus K_{i-1}^\circ$,
where each $\phi \in \bigcup_{i=1}^\infty \Phi_i$ is a $C^\infty$ function so that $\phi : \mathbb{R}^n \to [0, 1]$. Let $\Phi = \bigcup_{i=1}^\infty \Phi_i$.
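Since $\Phi$ is countable, we can list $\Phi = \{\phi_1, \phi_2, \phi_3, \ldots\}$. If $x \in D$ then let $j$ be the first
natural number so that $x \in K_j^\circ$. Then $\phi_k(x) > 0$ for some $\phi_k \in \Phi_j$. Choose $\gamma > 0$ so that
$B_\gamma(x) \subset K_j^\circ$. For all $i \geq j + 2$ we know that $B_\gamma(x) \cap (K_{i+1}^\circ \setminus K_{i-2}) = \emptyset$ which means that
there are only finitely many $\phi_i$ which are non-zero on $B_\gamma(x)$. Hence $s(x) = \sum_{i=1}^\infty \phi_i(x)$ is a
finite sum of $C^\infty$ functions on $B_\gamma(x)$, which means that $s(x)$ is a $C^\infty$ positive function on $D$.
For each $i \in \mathbb{N}$, define $\psi_i : D \to \mathbb{R}$ by $\psi_i(x) = \frac{\phi_i(x)}{s(x)}$ and let $\Psi = \{\psi_i\}_{i \in \mathbb{N}}$. Since $s$ is positive
on $D$ it follows that each $\psi_i$ is $C^\infty$, $\sum_{i=1}^\infty \psi_i(x) = \sum_{i=1}^\infty \frac{\phi_i(x)}{s(x)} = 1$ for all $x \in D$, and $\operatorname{spt}(\psi_i) = \operatorname{spt}(\phi_i)$
which is a subset of an element of $C$ for each $i \in \mathbb{N}$, so $\Psi$ is a partition of unity for $D$
subordinate to $C$, and therefore a partition of unity for $S$ subordinate to $C$ by Theorem
14.20.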
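
The normalization step at the heart of this proof (dividing each function by the sum of all of them) can be seen in a simplified one-dimensional sketch. This is not the full construction: it takes $S = [0, 2]$, uses the bumps of Theorem 14.15 directly in place of the functions supplied by Theorem 14.18, and omits the cutoff $f_{(S,W)}$. The centers, the radius and the names `bump` and `phi` are our own illustrative choices.

```python
import math

def w(a, b, x):
    """Bump of Theorem 14.15: positive on (a, b), zero elsewhere."""
    if not a < x < b:
        return 0.0
    left, right = (x - a) ** 2, (x - b) ** 2
    if left == 0.0 or right == 0.0:   # squares underflowed; value rounds to 0
        return 0.0
    return math.exp(-1.0 / left - 1.0 / right)

CENTERS = [0.0, 0.5, 1.0, 1.5, 2.0]   # hypothetical centers; the open intervals
RADIUS = 1.0                          # (c - 1, c + 1) form an open cover of S

def bump(i, x):
    c = CENTERS[i]
    return w(c - RADIUS, c + RADIUS, x)

def phi(i, x):
    """Normalized bump; phi(0, x) + ... + phi(4, x) = 1 wherever some bump is positive."""
    total = sum(bump(j, x) for j in range(len(CENTERS)))
    return bump(i, x) / total if total > 0.0 else 0.0
```

Each `phi(i, .)` takes values in $[0, 1]$, vanishes outside $(c_i - 1, c_i + 1)$, and the five functions sum to one at every point of $[0, 2]$.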




Theorem 14.22. Let $C = \{U_\alpha\}_{\alpha \in J}$ be an open cover of $S \subset \mathbb{R}^n$ with $C^\infty$ partition of
unity $\Phi = \{\phi_1, \phi_2, \phi_3, \ldots\}$ for $S$ where each $\phi_i$ is defined on some open set $U \supset S$ and $\Phi$ is
subordinate to $C$. Let $K$ be a compact subset of $S$. Then $\Phi_K = \{\phi_i \in \Phi \mid \phi_i(x) > 0$ for some
$x \in K\}$ is finite.


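Proof. For each $p \in K$ choose $\epsilon_p > 0$ so that $F_{(\Phi, \epsilon_p)} = \{\phi_i \in \Phi \mid \operatorname{spt}(\phi_i) \cap B_{\epsilon_p}(p) \neq
\emptyset\}$ is finite. Then $\{B_{\epsilon_p}(p)\}_{p \in K}$ is an open cover of $K$ which has a finite subcover $F =
\{B_{\epsilon_{p_i}}(p_i)\}_{1 \leq i \leq m}$. Since there are only finitely many $\phi_i$ which are non-zero on each $B_{\epsilon_{p_i}}(p_i)$,
it follows that there are only finitely many $\phi_i$ which are non-zero on $K$.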


We mentioned that partitions of unity are useful for taking local conditions and piecing 
them together to form global conditions. We conclude with an example to illustrate this 
notion. 


Definition 132 


We say an open cover $C$ of an open set $U$ in $\mathbb{R}^n$ is admissible if $W \subset U$ for every
$W \in C$. Let $\Phi$ be a partition of unity subordinate to $C$. Let $f : U \to \mathbb{R}$ be a function
so that for each $x \in U$ there is some $\epsilon_x > 0$ so that $f$ is bounded on $B_{\epsilon_x}(x)$ and
the set of discontinuities of $f$ has measure zero. We define $f$ to be integrable in the
extended sense if $\sum_{\phi \in \Phi} \int_U \phi|f|$ converges. In this case, we define $\int_U f = \sum_{\phi \in \Phi} \int_U \phi f$.
geo? U geo U 


Note that there is no problem with using $\int_U \phi|f|$ in the definition above since each $\phi$
is zero outside of a Jordan region, and this notation just means the integral over a Jordan
region contained in $U$ on which $f$ is non-zero. Also, the absolute convergence of $\sum_{\phi \in \Phi} \int_U \phi f$
is implied by the convergence of $\sum_{\phi \in \Phi} \int_U \phi|f|$. One can demonstrate that this definition is
independent of the particular partition of unity $\Phi$ and cover $C$ chosen, but we will not be
pursuing this topic further in this text.

The above definition allows us to extend the notion of integrability to an arbitrary open 
set rather than just a Jordan region, which is often interesting. A simple example of an 
instance where such an idea is of interest for single variable functions is the idea of an 
improper integral. 
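
As a familiar single variable instance of this remark, consider $f(x) = x^{-1/2}$ on the open set $U = (0, 1)$. This function is continuous and bounded near each point of $U$, but it is unbounded on $U$, so it is handled as an improper integral. The computation below is the standard improper integral; that the extended integral defined above returns the same value is what one would expect, but it is not verified in this text.

```latex
\int_0^1 \frac{1}{\sqrt{x}}\,dx
  = \lim_{a \to 0^+} \int_a^1 x^{-1/2}\,dx
  = \lim_{a \to 0^+} \left[ 2\sqrt{x} \right]_a^1
  = \lim_{a \to 0^+} \left( 2 - 2\sqrt{a} \right)
  = 2 .
```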


REFERENCES

1. Michael Spivak, Calculus on Manifolds, Westview Press, 1965.

2. William R. Wade, An Introduction to Analysis, fourth edition, Prentice Hall, 2009.

3. R. Creighton Buck, Advanced Calculus, International Series in Pure and Applied
Mathematics, 1956.

4. Richard Courant, Fritz John, Introduction to Calculus and Analysis, Volume I,
Springer-Verlag, 1974.


Index 


C*, 278 

Cc”, 143 

$C^\infty$, 143, 278
$F_\sigma$ set, 471
$G_\delta$ set, 471


Abel’s Formula, 223 

Absolute convergence, 220 

Absolute value, 16 

Absolute value of difference, 17 

Absolutely convergent series rearrangement, 
226 

Acceleration, 401 

Adding rational numbers, 19 

Alternating series remainder estimate, 224 

Alternating Series Test, 224 

Analytic, 239 

Angle, 251, 253, 254 

Antiderivative, 149, 151 

Approximation Property, 16 

Arc, 397, 398 

Area, 128 

Associativity, 9 

Average value, 361 


Basis, 262, 267 

Bezout’s Identity, 164 

Binary operation, 9 

Binomial Theorem, 35 
Bolzano-Weierstrass Theorem, 54
Boundary, 263 

Bounded, 13, 18, 257 

Bounded above, 13 

Bounded below, 13 


Cantor’s Theorem, 176 
Cantor-Schröder-Bernstein Theorem, 176
Cardinality, 58, 171 

Cardinality inequalities, 175 

Cauchy Convergence Criterion, 214 
Cauchy Mean Value Theorem, 105 


Cauchy sequence, 56 

Cauchy-Schwarz Inequality, 256 

Chain Rule, 103, 284 

Change of variables for one variable, 190 

Circulation, 415 

Clairaut’s Theorem, 282, 310 

Closed in a set, 51, 62, 257 

Closed interval, 14 

Closed intervals are closed, 61 

Closed ray, 14 

Closed set, 51-53, 257 

Closure, 263 

closure, 185 

Closure of a set, 51 

Closure of operations, 12 

Codomain, 2 

Cofactor, 457 

Commutativity, 9 

Compact, 183, 266 

Comparison series remainder estimate, 225 

Comparison Test, 215 

Comparison Theorem for functions, 78 

Comparison Theorem for sequences, 47 

Complement, 3 

Completeness axiom, 15 

Component limits, 260 

Composite number, 37 

Composition, 2 

Conditional convergence, 220 

Conditionally convergent series rearrangement, 
228 

Connected, 185, 269 

Connected sets in R are intervals, 186 

Conservative vector field, 409 

Continuous, 71, 184, 257, 265 

Continuous functions are integrable, 135 

Continuous image of compact set, 184 

Continuous image of connected, 185 

Convergence, 45, 61, 257, 258 

Convergence of sequence of functions, 237 




Convergent sequences are bounded, 47 
Convex, 269 

Coset, 14 

Countability of rational numbers, 60, 173 
Countable, 58, 172 

Countable unions of countable sets, 173 
Countably infinite, 58 

Cover, 183, 266 

Cramer’s Rule, 464 

Critical point, 291 

Cross product, 2, 254 

Cube, 318 

Curl, 424 

Curvature, 404, 405 

Curve, 287 

Cylindrical coordinates, 383 


Decimal expansion, 167 

Decimal sequence, 167 

Decreasing function, 49 

DeMorgan’s Laws, 19 

Dense, 185 

Density of rational numbers, 36, 61 

Derivative, 95 

Derivative is Jacobian, 280 

Derivative of logarithm, 147 

Derivative of multivariable function, 278 

Derivatives of trigonometric functions, 114 

Determinant, 455 

Determinant properties, 466 

Diameter, 257 

Differentiability implied by continuous partials, 
280 

Differentiable, 95 

Differentiable functions are continuous, 96 

Differential, 290 

Differentiation and integration of power series, 
241 

Dirichlet’s Test, 223 

Distance, 16, 251 

Distributivity, 9 

Divergence, 423 

Divergence Test, 215 

Divergence Theorem, 443 

Divides, 36 

Divisible, 36 

Divisor, 36 

Domain, 2 




Dot product, 251, 252 
Dot product rule, 286 


Element, 2 

End points of interval, 14 

Error for Newton’s method, 147 
Euclid’s Lemma, 163, 164 

Euclidean space, 250 

Euclidean space is separable, 268 
Even integer, 37 

Exponential derivatives, 149 
Exponential functions, 148, 153 
Exponential properties, 153 
Extended interval, 14 

Extended real number, 177 

Extended real number operations, 181 
Extreme Value Theorem, 82, 185, 269 


Fermat’s Theorem, 104 

Field, 9 

Field axioms, 9 

Finite cardinalities unique, 172 

Finite set, 58 

First Derivative Test, 108 

First derivative test, 292 

First Mean Value Theorem for Integrals, 136 

First point, 13 

Flux through a curve, 416 

Flux through a surface, 431 

Frenet formulas, 406 

Function, 2 

Fundamental Theorem of Arithmetic, 165 

Fundamental Theorem of Calculus (first form), 
138 

Fundamental Theorem of Calculus (second 
form), 139, 140 


Generalized induction, 33 

Generalized Mean Value Theorem, 105 
Geometric series, 216 

Gradient vector field, 409 

Graph, 285 

Greatest common divisor, 37 

Greatest lower bound, 13 

Green’s Theorem, 418 

Grid, 319 


Half open interval, 14 
Heine-Borel Theorem, 183, 268 




Homeomorphism, 184 


Identity, 9 

Identity matrix, 459 

Implicit Function Theorem, 296, 298 
Increasing function, 49 

Indefinite integral, 149 

Induction, 31, 161 

Inductive set, 160 

Infimum, 13 

Infinite limit operations, 179 
Infinite limits, 79, 176 

Infinite limits for sequences, 57 
Infinite set, 58 

Inner sum, 324 

Integer properties, 33 

Integrable, 128, 133, 134, 320 
Integral, 128, 134, 320, 338 

Integral notation, 137 

Integral series remainder estimate, 217, 219 
Integral Test, 217 

Integration by parts, 141 

Interior, 263 

Interior of a set, 51 

Intermediate Value Theorem, 82, 270 
Intersection, 3 

Intersection of closed sets, 54 
Interval, 14 

Interval of convergence, 239 
Inverse, 2, 9 

Inverse Function Theorem, 112, 294 
Inverse image, 2 

Inverse matrix, 465 

Inversion, 454 

Invertible matrix, 465 

Isolated point, 50, 257 

Iterated derivative, 95 


Jordan region, 327 


L’Hospital’s Rule, 110, 187 
Lagrange multipliers, 301, 304
Last point, 13
Least upper bound, 13
Lebesgue Characterization of Riemann Integrability, 201
Lebesgue measure zero, 196, 324
Lebesgue Number Lemma, 186, 269
Length of a curve, 400
Length of interval, 185
Length, norm or magnitude of a vector, 251
Limit, 257
Limit at a point, 71
Limit Comparison Test, 216
Limit from the left or below, 81
Limit from the right or above, 81
Limit infimum, 222
Limit point, 50, 61, 257, 258
Limit supremum, 222
Limits at infinity, 79
Limsup, 246
Lindelöf, 267
Line, 254
Line integral of a function, 409
Line integral of a vector field, 409
Linear transformation, 453
Local canonical form, 408
Local extremum, 103, 291
Local maximum, 103, 291
Local minimum, 103, 291
Log test, 225
Logarithm properties, 147
Logarithms, 147
Lower bound, 13
Lower integral, 128
Lower sum, 127, 319


Maclaurin series, 242
Manifold boundary, 436
Marking, 320
Marking of a partition, 127
Maximum, 13
Mean Value Theorem, 105
Mean Value Theorem for Real Valued Functions, 289
Mean Value Theorem for Vector Functions, 289
Mesh, 127, 319
Metric, 256
Metric space, 256
Minimum, 13
Minor, 457
Monotone Convergence Theorem, 49
Monotone function, 49
Monotone functions are integrable, 152
Multiplying rational numbers, 19


N choose k, 35
Natural number properties, 31, 160
Natural numbers, 160
Natural numbers are unbounded, 35
Natural numbers are well-ordered, 161
Neighborhood of a point, 51
Non-decreasing function, 49
Non-increasing function, 49
Non-overlapping, 324
Normal plane, 404


Odd, 37
One sided limits, 81
One to one, 2
Onto, 2
Open ball, 257
Open cover, 183, 266
Open in a set, 51, 62, 257
Open interval, 14
Open intervals are open, 61
Open ray, 14
Open set, 51, 257
Order axioms, 12
Ordered pair, 3
Ordering of reciprocals, 19
Orientation, 409, 431
Oriented closed curve, 417
Orthogonal, 253
Oscillation, 339
Oscillation at a point, 197, 198
Oscillation on a set, 197
Osculating plane, 404
Outer sum, 324


p-series, 231
Pairwise disjoint, 3
Parallel, 253
Parallelogram, 254
Parallelepiped, 254
Parametrization with respect to arc length, 401
Parametrized curve, 287
Partial derivative, 278
Partial sum, 148, 214
Partition of an interval, 127
Partition of unity, 472
Path, 287
Path connected, 269


Permutation, 454 

Perpendicular, 253 
Piecewise-smooth curve, 398 
Piecewise-smooth graph surface, 438 


Piecewise-smooth standard surface, 425 


Piecewise-type one and two, 419 


Piecewise-type one, two and three, 444 


Pigeonhole Principle, 174 
Plane, 253 

Polygonally connected, 269 
Polynomials are continuous, 85 
Power series, 239 

Power series uniqueness, 242 
Prime decomposition, 165 
Prime number, 37 

Product rule for derivatives, 101 
Product rule for function limits, 76 
Product rule for limits, 48 
Product rules for limits, 261 
Projection, 253 

Proper subset, 2 


Quotient, 164 

Quotient rule for derivatives, 101 
Quotient rule for function limits, 76 
Quotient rule for limits, 48, 261 


Radius of convergence, 239 
Range, 2 

Ratio Test, 220 

Rational exponents, 34 

Ray, 14 

Rearrangement, 226 
Rectangle, 318 

Rectifying plane, 405 
Recursive definition, 162 
Reduced terms, 166 
Refinement, 319 

Refinement of a partition, 127 
Regular curve, 397 

Regular surface, 425 
Remainder, 164 

Restriction of function to set, 81 
Riemann sum, 127, 320 
Rolle’s Theorem, 104 

Root Test, 223 

Row operations, 459 


Saddle point, 291 




Second Derivative Test, 109 
Second derivative test, 292 
Separable, 185, 267 
Separation, 185, 269 
Sequence, 45, 257 


Sequential Characterization of Continuity, 259 


Sequential Characterization of Limit, 74 

Sequential Characterization of Limits, 178, 
259 

Series, 143, 214 

Set, 2 

Simple closed curve, 397, 398 

Simple smooth curve, 397 

Skew, 254 

Smooth closed curve, 397 

Smooth curve, 397 

Speed, 401 

Spherical coordinates, 384 

Squeeze Theorem, 180, 258 

Squeeze Theorem for functions, 77, 85 

Squeeze Theorem for sequences, 47 

Standard basis vector, 453 

Standard graph function, 436 

Standard smooth graph surface, 436 

Standard surface, 425 

Stirling’s Formula, 229 

Stokes’s Theorem, 438 

Strong induction, 33 

Subcover, 266 

Subordinate, 472 

Subsequence, 45 

Subset, 2 

Subspace, 51 

Sum rule, 286 

Sum rule for derivatives, 101 

Sum rule for limits, 48, 261 

Sum rule for function limits, 76

Support, 469 

Suprema and infima of subsets, 20 

Supremum, 13 

Surface area, 429 

Surface integral, 429 

Symmetry, 256 


Tail of a sequence, 57 
Tail of series, 214 
Tangent line, 403 
Taylor remainder, 142 




Taylor series, 143, 231, 239 
Taylor’s Theorem, 142, 290 
Topology, 183 

Torsion, 404 

Trace of a parametrization, 287 
Transformation of Variables, 381 
Triangle Inequality, 17, 256 
Triangle inequality, 256 

Trichotomy on the real numbers, 12 


Uncountability of real numbers, 59, 171
Uncountable, 58
Uniform Cauchy Criterion, 237
Uniform continuity, 83
Uniform convergence, 237
Uniform convergence preserves continuity, 237
Uniform convergence preserves integral, 238
Uniformly continuous, 267
Uniformly differentiable set function, 375
Union, 3
Union of open sets, 53
Uniqueness of identity, 10
Uniqueness of inverses, 10
Unit binormal vector, 404
Unit normal vector, 404
Unit tangent vector, 404
Upper and lower sum notation, 128
Upper bound, 13
Upper integral, 128
Upper sum, 127, 319
Urysohn's Lemma for $C^\infty$, 470


Vector, 250 
Vector field, 409 
Velocity, 401 
Volume, 318, 327 


Weierstrass M-test, 238 
Well ordered, 160 


Zero extension, 335 


